Doc/Articles/Play181
At Play With J ... Table of Contents ... Previous Chapter ... Next Chapter
28. Boggle
. By Eugene McDonnell. First published in Vector, 18, 1, (July 2001), 91-102.
I suppose many of you have played as a child with a set of blocks, wooden cubes about an inch and a half on a side, with pictures, letters, numbers, and designs on the faces. Did you ever set the alphabet faces in a row to spell a word? Suppose you had such a set of blocks in which all of the faces had letters on them, and that you had a tray divided by partitions to form rows and columns, giving cells into which the blocks just fitted. Now if you jumble up your blocks in a bag, then take out a block at a time and with your eyes closed, put it securely in one of the cells at random until they are all full, you will find when you then open your eyes that the letters that are face up will be in all different orientations. Now if you are given the task of finding among these as many words of four or more letters formed among blocks that are connected either by an edge or a corner within three minutes, you will have some idea of how to play a word game I very much like.
The game comes in two forms; the original game, the one I cut my teeth on, is called Boggle™, and is played on a 4-by-4 tray. The other game, which I now favour, and which appeared several years after Boggle was introduced, is called Big Boggle™, and is played on a 5-by-5 tray. The blocks are miniature, about five-eighths of an inch on a side and made of plastic, as is the tray. There is also a clear plastic dome which fits snugly over the tray, permitting the whole to be turned upside down, shaken, and turned upright, so that with a little jiggling each block nestles upright in one of the cells. The game comes with a three-minute egg-timer and a little sheet of instructions. The letters on each of the sixteen Boggle cubes and the 25 Big Boggle cubes are given by the columns of the tables below. They are shown as lower-case here, but in the game they appear as capitals. The letter shown as 'q' is actually the digraph 'qu', and if used in a word counts as two letters:
aaaaacdddeeeeehh aaaaaaaabccccddddeeefgino abcfoieeieehilil fdeaeaeajceeehhdhnmiiopoo ebhfomilsgirormn ieeageefkeiiillhhsoiprroo ejoktolrthntstnn rnefmegiqnlipnnnlstirrrtt goppttrvtnsvstqr snernemrxsplsoooosttsvrut nosswuxyywuwtyuz ynmsneuszttttrrtruttywywu
Here are the rules supplied with the game:
. Object: To list, within 3 minutes, as many words of the highest point value as you can find among the random assortment of letters in the cube grid. . Preparation: Each player should have a pencil and a piece of paper. Drop the letter cubes into the dome and place the grid, open side down, over the dome. Turn the domed grid right-side up, vigorously shake the cubes around, and maneuver the grid until each cube falls into place. Then, as one player removes the dome, another player starts the timer. . Playing: When the timer starts, each player searches the assortment of letters for words of four letters or more. When you find a word, write it down. . Words are formed from adjoining letters. Letters must join in the proper sequence to spell a word. They may join horizontally, vertically or diagonally to the left, right, or up-and-down. No letter cube, however, may be used more than once within a single word. . Type of words allowed: The only words that are allowed are those that can be found in a standard English dictionary. You may look for any type of word—noun, verb, adjective, adverb, etc. Plural nouns are acceptable as are all verb tenses. Words within words are also allowed, e.g., master: mast, aster. . Type of words not allowed: Proper nouns, abbreviations, contractions, hyphenated words, and foreign words that are not in an English dictionary. . Scoring and winning: When the timer runs out, everyone must stop writing. Each player in turn then reads aloud his or her list of words. Any word that appears on more than one player's list must be crossed off all lists, including that of the reader. The same word found by a player in different areas of the grid may not be counted for multiple credit. . After all players have read their lists, each player scores his or her remaining words, differing point values accorded to the words according to their lengths, as follows:
no. of letters 4 5 6 7 8+ points 1 2 3 5 11
The winner is a) the player whose words have earned the most points, or b) the first to reach 50 points, 100 points or whatever score is considered by all to be a reasonable target.
I usually play until at least one player has reached or passed 100 points. I've played the game with three or four players, but prefer the two-person game. My wife and I have established several additional house rules. Since I frequently wrote down spurious words, my wife insisted that there be a penalty for such. Thus, any word may be challenged. If it is not found in the dictionary the player is given a score of -1 for it. If it is in the dictionary, the player gets an additional point for it. We began by using our huge unabridged Merriam-Webster dictionary, but this, my wife claimed, gave me an unfair advantage, since I frequently wrote down archaic Scottish words and the like that were in this dictionary but not in smaller ones. We then began using the Concise Oxford Dictionary (COD), but had to give up on that, too, since it favoured English words and English spellings. My wife was tired of me putting down words like twee and nous that are unknown on our side of the Atlantic. We now use the American Heritage Dictionary (AHD), Ken Iverson's favourite. Here is a sample grid:
t i n e n i n t o c n a r e t l
These letters are shown in normal position; in practice they can have any of the four possible orientations. Try your hand at finding words in this grid. Remember, you have just three minutes. You can see the words I found, unaided by computer, at the end of this paper.
The Problem
Twenty years ago, when I was Recreational APL editor of APL Quote Quad, I received a letter from Robert Ashworth, of Carbondale, Illinois, asking if I could write a column on the Boggle game. I was agreeable to the extent of posing these problems to my readers (this was before Big Boggle was born, so assumes the smaller Boggle situation):
1. Find the number of paths of length 4 in the grid. 1. Find the number of paths of length 5. 1. Write a suite of functions that, given a 4-by-4 character table, finds all words of length 4 and 5, using the rules of Boggle. Assume the existence of two tables w4 and w5 containing all the acceptable words of length 4 and 5, respectively.
I thought this was the most difficult problem I'd ever proposed. Furthermore, at the time I had only vague ideas of how to go about solving it. Recently, nagged by this unfinished business, I revisited the problem, with a degree of success.
A Boggle path is a sequence of distinct connected cells, connected in the Boggle sense. An interior cell is connected to the eight surrounding cells. An edge cell, not on a corner, is connected to the five surrounding cells. A corner cell is connected to the three surrounding cells. It may be suitable at times of to use the cell's list indices:
] m=: i. 4 4 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
and at other time their row-column indices:
] q=: <"1 (4 4#: m) +---+---+---+---+ |0 0|0 1|0 2|0 3| +---+---+---+---+ |1 0|1 1|1 2|1 3| +---+---+---+---+ |2 0|2 1|2 2|2 3| +---+---+---+---+ |3 0|3 1|3 2|3 3| +---+---+---+---+
Normalizing any 3-by-3 portion of the row-column index grid by subtracting the central item from each of the items, shows that two cells are connected if the maximum magnitude of the difference of their row-column indices is 1.
(] -&.> (<1 1)"_ { ])3 _3{.q +-----+----+----+ |_1 _1|_1 0|_1 1| +-----+----+----+ |0 _1 |0 0 |0 1 | +-----+----+----+ |1 _1 |1 0 |1 1 | +-----+----+----+
The only way I know to find the number of paths for different cases is by constructing the paths. So to solve problems a and b above implies finding the paths themselves.
An easy but expensive way is to find a superset of the paths, by taking all the combinations of sixteen things taken four at a time, that is, 4!16 or 1,820, then get each of the 24 permutations of every combination, giving (!4)*(4!16) or 43,680 four-item lists, then select from the table all rows giving Boggle-connected paths. The verb comb is due to Roger Hui, and it and its component parts are given at the end of this paper. The function paths has syntax r =: n paths k and gives the paths of length n in a k-by-k grid.
paths=: dyad define NB. find all paths of length x in a grid of size y * y NB. px is a table of all the permutations of length x px=. (i. !x)A. i. x NB. cxn is a table of all the combinations of y things taken x at a time cxn=. x comb *: y NB. pc is a table of all the permutations of each of the combinations pc=: ,/px{"2 1 cxn NB. (i{mpc) is 1 if (i{cxn) is Boggle-connected and 0 otherwise mpc=. y okf pc NB. cxno is all the paths of length x in a y*y grid cxno=. /:~ mpc#pc )
This is the okf function:
okf=: 13 : '1=>./,|2-/\(2#x)#:y'"1
Given a list y, this takes the row-column representation ((2#x)#:y) of each item, the difference of pairs of successive representations (2-/\), their magnitudes (|), ravels these (,), finds their maximum (>./), compares this to 1 (1=), yielding 1 if the list is Boggle-connected, and 0 otherwise. For example, the number of paths of length 3 in a 4-by-4 grid is given by:
#3 paths 4 408
Here are four successive items from pc after running #4 paths 4 :
37 38 39 40{pc 2 0 4 1 2 1 0 4 2 1 4 0 2 4 0 1
And here is the result of applying 4 okf to each of these rows:
4 okf 37 38 39 40{pc 0 1 1 0
If you look again at m you can verify that 2 0 4 1 and 2 4 0 1 are not Boggle-connected, but 2 1 0 4 and 2 1 4 0 are:
m 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Don't bother to use this function. It takes an unbearably long time as the path lengths get just a little larger. It is intended only to let you know how my thinking was going.
An easier way to get the paths is to build them up starting with the n^2 paths of length 1, and extending these only with promising items. At this point I showed my astuteness by sending a message to Roger Hui explaining what I was doing, and asking him if anything bright in a combinatorial way occurred to him. I received, in rapid succession, four replies, written while he was babysitting his son Nicholas as he was in the middle of moving from Toronto to Vancouver. I'll pass over the first three messages because, as usual with Roger, one idea suggested a better idea. Here is his last effort:
rimb=: _1 ,. (_1 , ] , _1:) ,. _1: tileb=: 3 3 &(,;._3) nborsb=: (4 1 4#1 0 1)&#"1 @ (,/) @ tileb @ rimb @ i. @ ,~ initb=: ,. @ i. @ *: extendb=: 8&#@] ,. [: , {:"1@] { [ testb=: *./"1@(0&<:) *. i.@{:@$ -:"1 i."1~ stepb=: (testb # ])@extendb pathb=: 4 : '(nborsb x) stepb^:(y-1) initb x'
The pathb dyad has syntax z =: n pathb k , and yields a table with k columns, with rows giving all the paths of length k to be found in an n-by-n grid. It begins by using initb to form a seed table having all possible starting cells of paths, namely, a single column with values i. *: n . Each use of its step dyad, beginning with the seed table, extends its argument, a table of paths of length k-1, to form the table of paths of length k. The stepb dyad requires as its left argument a table with eight columns and as many rows as there are cells in the grid. Each row contains a list of all the neighbours of each cell. In the case of edge cells, which have only five neighbours, and corner cells, with only three, the lists are filled out with negative-one values. The nborsb monad builds this table by forming an n-by-n grid with (i.@,~) and using the rimb monad to border this with negative ones. This is tesselated into raveled 3-by-3 squares with the tileb monad. Ravel (,) and insert-ravel (,/) make the individual tables into one large table. To complete the table the central column is deleted with (4 1 4#1 0 1)&#"1 . Here is what this table for the 4-by-4 case looks like:
nborsb 4 _1 _1 _1 _1 1 _1 4 5 _1 _1 _1 0 2 4 5 6 _1 _1 _1 1 3 5 6 7 _1 _1 _1 2 _1 6 7 _1 _1 0 1 _1 5 _1 8 9 0 1 2 4 6 8 9 10 1 2 3 5 7 9 10 11 2 3 _1 6 _1 10 11 _1 _1 4 5 _1 9 _1 12 13 4 5 6 8 10 12 13 14 5 6 7 9 11 13 14 15 6 7 _1 10 _1 14 15 _1 _1 8 9 _1 13 _1 _1 _1 8 9 10 12 14 _1 _1 _1 9 10 11 13 15 _1 _1 _1 10 11 _1 14 _1 _1 _1 _1
The table for the 5-by-5 case is similar, but with 25 rows instead of 16.
At each step, the last item of each of the current paths is used to select the appropriate row from the neighbours table, that row is replicated eight times, and each neighbour is appended to one of the eight replicated rows, using the extendb dyad. The testb monad removes illegal rows from this extended table, ensuring that no item is duplicated, and no negative ones are present in the result. This process continues for k-1 steps, at the end of which we have obtained the desired table.
The pathb function executes 4 pathb 4 in 0.02 seconds, and 4 pathb 5 in 0.09 seconds on my 233mhz computer. With its help I found the answers to problems a and b:
#4 pathb 4 1764 #4 pathb 5 6712
The time for my paths function to do 4 paths 4 is 43.4 seconds. I don't have the patience to try cases involving longer paths.
Now it's time to solve problem c. I have carried about with me for about 15 years word lists from the American Heritage Dictionary. The original data came from the publisher Houghton-Mifflin on a single file on a magnetic tape which Joey Tuttle had mounted on a tape drive in the Toronto machine room of I. P. Sharp Associates and read into the Amdahl V8 then in use there. He processed this file in several ways, the one of most use to me being sorted lists of words all of the same length, which I have named Words02 through Words26. These are now resident on a Macintosh computer in my home. With the tools you've seen developed, you could probably arrive at a solution yourself. Assume that you have b, the ravel of the letters in the grid, p the table of paths, and w the list of words, all for a given length. Then
r=: /: ~ ~. (w e. p { b) # w
will give r, a sorted table of all the words of a given length, with no duplicates, satisfying a legal path on the grid. Before showing the results I obtained using the phrase above, I give first the words I found unaided by computer in this grid:
t i n e n i n t o c n a r e t l
ante alter cental cent inner lancer coin inter lancet core lance lanner lane later recoin late nance rennet nice nicer rental nine octal tanner rent renal tannin tine tinner tint
All of these are in COD. All but "nance" and "recoin" are in AHD. And here are those found by the computer:
anne alter alnico ante ancon cental cent anent cetane cero anion encina coin conti encore core creon innate etna enate lancer icon inane lancet lane inner lanner late inter latent neon lance nocent nero later octane nice nicer octant nine renal rectal nore renan rennet once renin rennin reni rente rental rent tater tanner tate tinct tannic tent tannin tine tenant tint tinner
The results of me versus the computer are: 4-letter words, 11 vs 16; 5-letter words, 9 vs 16; and 6-letter words, 10 vs 22. The second table shows a defect of my word collection. The tape we got from Houghton Mifflin includes biographical and geographical entries, and these have not been removed. I've put in italics those words, which, by Boggle rules, are not legal words. Also, COD has some words not in AHD and vice-versa. For example, "cero" (a western Atlantic fish) is in AHD, but not in COD, and "recoin" is in COD but not AHD.
There are longer words in the list. How many of you found "continent"? And "continental"? And the 16-letter behemoth "intercontinental"? Of course, I contrived this case, choosing a suitable looking word from my word16 file, in order to fill a 4-by-4 grid completely.
Here is an incomplete table of the number of paths of given lengths in grids from size two to five:
paths =: 0 : 0 The number of paths of length k connecting distinct adjacent points in a square grid of n * n points: k\n 2 3 4 5 1 4 9 16 25 2 12 40 84 144 3 24 160 408 768 4 24 496 1764 3768 5 1208 6712 17280 6 2240 22672 74072 7 2984 68272 296360 8 2384 183472 9 784 436984 )
I've been unable to find the law of this table. Row 1 are the squares, of course. Row 2 is four times the alternate triangular numbers (3 10 21 36). Beyond that deponent witnesseth not. The columns headed 2 and 3 are complete, but even with Hui's much more efficient functions, larger cases for columns 4 and 5 are still unattainable with my computer (and my patience).
There are two ways the task can be made more efficient: first, instead of using pathb, which necessitates going back to the beginning for each case computed for a given grid size, the previous results can be stored, allowing case k to be computed by using the stepb function on the k-1 result; and second, by taking advantage of the symmetries of the square grid. The number of paths of a given length proceeding from any corner point are the same; similarly for symmetrically located edge points and interior points. The symmetries are evident if the number of paths beginning from each grid point are displayed. For example, here are the number of paths from each point for a 4-by-4 case and a 5-by-5 case:
4 4$#/.~{.|:4 pathb 5 322 435 435 322 435 486 486 435 435 486 486 435 322 435 435 322 5 5$#/.~{.|:5 pathb 6 1874 2752 2998 2752 1874 2752 3524 3672 3524 2752 2998 3672 3784 3672 2998 2752 3524 3672 3524 2752 1874 2752 2998 2752 1874
In general, for a grid of side n, containing n^2 points, there are only g(n) distinct numbers of paths, where g is
g=: 2&!@>:@>.@-:
Here are the results of g applied to grid sizes one through 8:
(,:g) 1+i.8 1 2 3 4 5 6 7 8 1 1 3 3 6 6 10 10
So that, for the 4-by-4 case, only three values have to be computed, not 16, and for the 5-by-5 case, only six, not 25. The total number of cases for a given path length k in a grid of size n can be obtained by an inner product:
4 4$#/.~{.|:4 pathb 4 75 109 109 75 109 148 148 109 109 148 148 109 75 109 109 75 4 8 4+/ . * 75 109 148 1764 #4 pathb 4 1764
When the actual paths are needed, the additional cases can be obtained from the abbreviated tables by using the appropriate indices to select from them, the indices being the permutations obtained from the ravels of the rotations and reversals of the table i.(n,n) . Here is how this is done:
NB. get neighbours table n4=: nborsb 4 NB. form monad to step previous case by bonding stepb's left argument f=: n4&stepb NB. initial case for abbreviated situation ] ip=: ,.0 1 5 0 1 5 NB. Get all paths for abbreviated argument $p42=: f ip 16 2 NB. Only sixteen cases of the 84 obtained using full argument $4 pathb 2 84 2 NB. Use phrase 7.b.d8 to obtain desired permuted lists of all cell numbers ] allpcn=: ,"2 (i.8)d8"0 2 i.4 4 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 12 8 4 0 13 9 5 1 14 10 6 2 15 11 7 3 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 3 7 11 15 2 6 10 14 1 5 9 13 0 4 8 12 12 13 14 15 8 9 10 11 4 5 6 7 0 1 2 3 15 11 7 3 14 10 6 2 13 9 5 1 12 8 4 0 3 2 1 0 7 6 5 4 11 10 9 8 15 14 13 12 0 4 8 12 1 5 9 13 2 6 10 14 3 7 11 15 NB. Get all length-2 paths (with possible duplicates) $q42=: ,/p42{"1/allpcn 128 2 NB. Get ordered set of all length-2 paths $s42=: /:~ ~. q42 84 2 NB. compare ordered set to long-winded but accurate set (4 pathb 2) -: s42 1
Appendix
Hui's combinations suite:
startc=: i.@-.@- countc=: <:@[ ! <:@[ + |.@startc indexc=: ;@:((i.-])&.>) recurc=: (countc#startc) ,. (indexc@countc{comb&.<:) testc=: *@[ *. < basisc=: i.@(<: , [) comb=: basisc`recurc @. testc
Rotation/reflection suite (from J Phrases, 7.B.d8):
m0=: ] NB. Identity m1=: m6@m7 NB. Three-o’clock rotation m2=: m4@m6 NB. Six-o’clock rotation m3=: m4@m7 NB. Nine-o’clock rotation m4=: |.@] NB. Horizontal reflection m5=: m2@m7 NB. Counterdiagonal reflection m6=: |."_1@] NB. Vertical reflection m7=: |:@] NB. Diagonal reflection d8=: m0`m1`m2`m3`m4`m5`m6`m7 @. [ NB. i d8 y gives mi y (all rotates and reflects)