Doc/Articles/Play181

From J Wiki
Jump to navigation Jump to search

At Play With J ... Table of Contents ... Previous Chapter ... Next Chapter

28. Boggle

. By Eugene McDonnell. First published in Vector, 18, 1, (July 2001), 91-102.

I suppose many of you have played as a child with a set of blocks, wooden cubes about an inch and a half on a side, with pictures, letters, numbers, and designs on the faces. Did you ever set the alphabet faces in a row to spell a word? Suppose you had such a set of blocks in which all of the faces had letters on them, and that you had a tray divided by partitions to form rows and columns, giving cells into which the blocks just fitted. Now if you jumble up your blocks in a bag, then take out a block at a time and with your eyes closed, put it securely in one of the cells at random until they are all full, you will find when you then open your eyes that the letters that are face up will be in all different orientations. Now if you are given the task of finding among these as many words of four or more letters formed among blocks that are connected either by an edge or a corner within three minutes, you will have some idea of how to play a word game I very much like.

The game comes in two forms; the original game, the one I cut my teeth on, is called Boggle™, and is played on a 4-by-4 tray. The other game, which I now favour, and which appeared several years after Boggle was introduced, is called Big Boggle™, and is played on a 5-by-5 tray. The blocks are miniature, about five-eighths of an inch on a side and made of plastic, as is the tray. There is also a clear plastic dome which fits snugly over the tray, permitting the whole to be turned upside down, shaken, and turned upright, so that with a little jiggling each block nestles upright in one of the cells. The game comes with a three-minute egg-timer and a little sheet of instructions. The letters on each of the sixteen Boggle cubes and the 25 Big Boggle cubes are given by the columns of the tables below. They are shown as lower-case here, but in the game they appear as capitals. The letter shown as 'q' is actually the digraph 'qu', and if used in a word counts as two letters:

aaaaacdddeeeeehh    aaaaaaaabccccddddeeefgino
abcfoieeieehilil    fdeaeaeajceeehhdhnmiiopoo
ebhfomilsgirormn    ieeageefkeiiillhhsoiprroo
ejoktolrthntstnn    rnefmegiqnlipnnnlstirrrtt
goppttrvtnsvstqr    snernemrxsplsoooosttsvrut
nosswuxyywuwtyuz    ynmsneuszttttrrtruttywywu

Here are the rules supplied with the game:

. Object: To list, within 3 minutes, as many words of the highest point value as you can find among the random assortment of letters in the cube grid. . Preparation: Each player should have a pencil and a piece of paper. Drop the letter cubes into the dome and place the grid, open side down, over the dome. Turn the domed grid right-side up, vigorously shake the cubes around, and maneuver the grid until each cube falls into place. Then, as one player removes the dome, another player starts the timer. . Playing: When the timer starts, each player searches the assortment of letters for words of four letters or more. When you find a word, write it down. . Words are formed from adjoining letters. Letters must join in the proper sequence to spell a word. They may join horizontally, vertically or diagonally to the left, right, or up-and-down. No letter cube, however, may be used more than once within a single word. . Type of words allowed: The only words that are allowed are those that can be found in a standard English dictionary. You may look for any type of word—noun, verb, adjective, adverb, etc. Plural nouns are acceptable as are all verb tenses. Words within words are also allowed, e.g., master: mast, aster. . Type of words not allowed: Proper nouns, abbreviations, contractions, hyphenated words, and foreign words that are not in an English dictionary. . Scoring and winning: When the timer runs out, everyone must stop writing. Each player in turn then reads aloud his or her list of words. Any word that appears on more than one player's list must be crossed off all lists, including that of the reader. The same word found by a player in different areas of the grid may not be counted for multiple credit. . After all players have read their lists, each player scores his or her remaining words, differing point values accorded to the words according to their lengths, as follows:

no. of letters 4 5 6 7 8+
points 1 2 3 5 11

The winner is a) the player whose words have earned the most points, or b) the first to reach 50 points, 100 points or whatever score is considered by all to be a reasonable target.

I usually play until at least one player has reached or passed 100 points. I've played the game with three or four players, but prefer the two-person game. My wife and I have established several additional house rules. Since I frequently wrote down spurious words, my wife insisted that there be a penalty for such. Thus, any word may be challenged. If it is not found in the dictionary the player is given a score of -1 for it. If it is in the dictionary, the player gets an additional point for it. We began by using our huge unabridged Merriam-Webster dictionary, but this, my wife claimed, gave me an unfair advantage, since I frequently wrote down archaic Scottish words and the like that were in this dictionary but not in smaller ones. We then began using the Concise Oxford Dictionary (COD), but had to give up on that, too, since it favoured English words and English spellings. My wife was tired of me putting down words like twee and nous that are unknown on our side of the Atlantic. We now use the American Heritage Dictionary (AHD), Ken Iverson's favourite. Here is a sample grid:

    t i n e
    n i n t
    o c n a
    r e t l

These letters are shown in normal position; in practice they can have any of the four possible orientations. Try your hand at finding words in this grid. Remember, you have just three minutes. You can see the words I found, unaided by computer, at the end of this paper.

The Problem

Twenty years ago, when I was Recreational APL editor of APL Quote Quad, I received a letter from Robert Ashworth, of Carbondale, Illinois, asking if I could write a column on the Boggle game. I was agreeable to the extent of posing these problems to my readers (this was before Big Boggle was born, so assumes the smaller Boggle situation):

1. Find the number of paths of length 4 in the grid. 1. Find the number of paths of length 5. 1. Write a suite of functions that, given a 4-by-4 character table, finds all words of length 4 and 5, using the rules of Boggle. Assume the existence of two tables w4 and w5 containing all the acceptable words of length 4 and 5, respectively.

I thought this was the most difficult problem I'd ever proposed. Furthermore, at the time I had only vague ideas of how to go about solving it. Recently, nagged by this unfinished business, I revisited the problem, with a degree of success.

A Boggle path is a sequence of distinct connected cells, connected in the Boggle sense. An interior cell is connected to the eight surrounding cells. An edge cell, not on a corner, is connected to the five surrounding cells. A corner cell is connected to the three surrounding cells. It may be suitable at times of to use the cell's list indices:

   ] m=: i. 4 4
 0  1  2  3
 4  5  6  7
 8  9 10 11
12 13 14 15

and at other time their row-column indices:

   ] q=: <"1 (4 4#: m)
+---+---+---+---+
|0 0|0 1|0 2|0 3|
+---+---+---+---+
|1 0|1 1|1 2|1 3|
+---+---+---+---+
|2 0|2 1|2 2|2 3|
+---+---+---+---+
|3 0|3 1|3 2|3 3|
+---+---+---+---+

Normalizing any 3-by-3 portion of the row-column index grid by subtracting the central item from each of the items, shows that two cells are connected if the maximum magnitude of the difference of their row-column indices is 1.

   (] -&.> (<1 1)"_ { ])3 _3{.q
+-----+----+----+
|_1 _1|_1 0|_1 1|
+-----+----+----+
|0 _1 |0 0 |0 1 |
+-----+----+----+
|1 _1 |1 0 |1 1 |
+-----+----+----+

The only way I know to find the number of paths for different cases is by constructing the paths. So to solve problems a and b above implies finding the paths themselves.

An easy but expensive way is to find a superset of the paths, by taking all the combinations of sixteen things taken four at a time, that is, 4!16 or 1,820, then get each of the 24 permutations of every combination, giving (!4)*(4!16) or 43,680 four-item lists, then select from the table all rows giving Boggle-connected paths. The verb comb is due to Roger Hui, and it and its component parts are given at the end of this paper. The function paths has syntax r =: n paths k and gives the paths of length n in a k-by-k grid.

paths=: dyad define

NB. find all paths of length x in a grid of size y * y

NB. px is a table of all the permutations of length x
      px=. (i. !x)A. i. x
NB. cxn is a table of all the combinations of y things taken x at a time
      cxn=. x comb *: y
NB. pc is a table of all the permutations of each of the combinations
      pc=: ,/px{"2 1 cxn
NB. (i{mpc) is 1 if (i{cxn) is Boggle-connected and 0 otherwise
      mpc=. y okf pc
NB. cxno is all the paths of length x in a y*y grid
      cxno=. /:~ mpc#pc
)

This is the okf function:

   okf=: 13 : '1=>./,|2-/\(2#x)#:y'"1

Given a list y, this takes the row-column representation ((2#x)#:y) of each item, the difference of pairs of successive representations (2-/\), their magnitudes (|), ravels these (,), finds their maximum (>./), compares this to 1 (1=), yielding 1 if the list is Boggle-connected, and 0 otherwise. For example, the number of paths of length 3 in a 4-by-4 grid is given by:

   #3 paths 4
408

Here are four successive items from pc after running #4 paths 4 :

   37 38 39 40{pc
2 0 4 1
2 1 0 4
2 1 4 0
2 4 0 1

And here is the result of applying 4 okf to each of these rows:

   4 okf 37 38 39 40{pc
0 1 1 0

If you look again at m you can verify that 2 0 4 1 and 2 4 0 1 are not Boggle-connected, but 2 1 0 4 and 2 1 4 0 are:

   m
 0  1  2  3
 4  5  6  7
 8  9 10 11
12 13 14 15

Don't bother to use this function. It takes an unbearably long time as the path lengths get just a little larger. It is intended only to let you know how my thinking was going.

An easier way to get the paths is to build them up starting with the n^2 paths of length 1, and extending these only with promising items. At this point I showed my astuteness by sending a message to Roger Hui explaining what I was doing, and asking him if anything bright in a combinatorial way occurred to him. I received, in rapid succession, four replies, written while he was babysitting his son Nicholas as he was in the middle of moving from Toronto to Vancouver. I'll pass over the first three messages because, as usual with Roger, one idea suggested a better idea. Here is his last effort:

rimb=: _1 ,. (_1 , ] , _1:) ,. _1:
tileb=: 3 3 &(,;._3)
nborsb=: (4 1 4#1 0 1)&#"1 @ (,/) @ tileb @ rimb @ i. @ ,~

initb=: ,. @ i. @ *:
extendb=: 8&#@] ,. [: , {:"1@] { [
testb=: *./"1@(0&<:) *. i.@{:@$ -:"1 i."1~
stepb=: (testb # ])@extendb
pathb=: 4 : '(nborsb x) stepb^:(y-1) initb x'

The pathb dyad has syntax z =: n pathb k , and yields a table with k columns, with rows giving all the paths of length k to be found in an n-by-n grid. It begins by using initb to form a seed table having all possible starting cells of paths, namely, a single column with values i. *: n . Each use of its step dyad, beginning with the seed table, extends its argument, a table of paths of length k-1, to form the table of paths of length k. The stepb dyad requires as its left argument a table with eight columns and as many rows as there are cells in the grid. Each row contains a list of all the neighbours of each cell. In the case of edge cells, which have only five neighbours, and corner cells, with only three, the lists are filled out with negative-one values. The nborsb monad builds this table by forming an n-by-n grid with (i.@,~) and using the rimb monad to border this with negative ones. This is tesselated into raveled 3-by-3 squares with the tileb monad. Ravel (,) and insert-ravel (,/) make the individual tables into one large table. To complete the table the central column is deleted with (4 1 4#1 0 1)&#"1 . Here is what this table for the 4-by-4 case looks like:

   nborsb 4
_1 _1 _1 _1  1 _1  4  5
_1 _1 _1  0  2  4  5  6
_1 _1 _1  1  3  5  6  7
_1 _1 _1  2 _1  6  7 _1
_1  0  1 _1  5 _1  8  9
 0  1  2  4  6  8  9 10
 1  2  3  5  7  9 10 11
 2  3 _1  6 _1 10 11 _1
_1  4  5 _1  9 _1 12 13
 4  5  6  8 10 12 13 14
 5  6  7  9 11 13 14 15
 6  7 _1 10 _1 14 15 _1
_1  8  9 _1 13 _1 _1 _1
 8  9 10 12 14 _1 _1 _1
 9 10 11 13 15 _1 _1 _1
10 11 _1 14 _1 _1 _1 _1

The table for the 5-by-5 case is similar, but with 25 rows instead of 16.

At each step, the last item of each of the current paths is used to select the appropriate row from the neighbours table, that row is replicated eight times, and each neighbour is appended to one of the eight replicated rows, using the extendb dyad. The testb monad removes illegal rows from this extended table, ensuring that no item is duplicated, and no negative ones are present in the result. This process continues for k-1 steps, at the end of which we have obtained the desired table.

The pathb function executes 4 pathb 4 in 0.02 seconds, and 4 pathb 5 in 0.09 seconds on my 233mhz computer. With its help I found the answers to problems a and b:

   #4 pathb 4
1764
   #4 pathb 5
6712

The time for my paths function to do 4 paths 4 is 43.4 seconds. I don't have the patience to try cases involving longer paths.

Now it's time to solve problem c. I have carried about with me for about 15 years word lists from the American Heritage Dictionary. The original data came from the publisher Houghton-Mifflin on a single file on a magnetic tape which Joey Tuttle had mounted on a tape drive in the Toronto machine room of I. P. Sharp Associates and read into the Amdahl V8 then in use there. He processed this file in several ways, the one of most use to me being sorted lists of words all of the same length, which I have named Words02 through Words26. These are now resident on a Macintosh computer in my home. With the tools you've seen developed, you could probably arrive at a solution yourself. Assume that you have b, the ravel of the letters in the grid, p the table of paths, and w the list of words, all for a given length. Then

   r=: /: ~ ~. (w e. p { b) # w

will give r, a sorted table of all the words of a given length, with no duplicates, satisfying a legal path on the grid. Before showing the results I obtained using the phrase above, I give first the words I found unaided by computer in this grid:

    t i n e
    n i n t
    o c n a
    r e t l
ante alter cental
cent inner lancer
coin inter lancet
core lance lanner
lane later recoin
late nance rennet
nice nicer rental
nine octal tanner
rent renal tannin
tine tinner
tint

All of these are in COD. All but "nance" and "recoin" are in AHD. And here are those found by the computer:

anne alter alnico
ante ancon cental
cent anent cetane
cero anion encina
coin conti encore
core creon innate
etna enate lancer
icon inane lancet
lane inner lanner
late inter latent
neon lance nocent
nero later octane
nice nicer octant
nine renal rectal
nore renan rennet
once renin rennin
reni rente rental
rent tater tanner
tate tinct tannic
tent tannin
tine tenant
tint tinner

The results of me versus the computer are: 4-letter words, 11 vs 16; 5-letter words, 9 vs 16; and 6-letter words, 10 vs 22. The second table shows a defect of my word collection. The tape we got from Houghton Mifflin includes biographical and geographical entries, and these have not been removed. I've put in italics those words, which, by Boggle rules, are not legal words. Also, COD has some words not in AHD and vice-versa. For example, "cero" (a western Atlantic fish) is in AHD, but not in COD, and "recoin" is in COD but not AHD.

There are longer words in the list. How many of you found "continent"? And "continental"? And the 16-letter behemoth "intercontinental"? Of course, I contrived this case, choosing a suitable looking word from my word16 file, in order to fill a 4-by-4 grid completely.

Here is an incomplete table of the number of paths of given lengths in grids from size two to five:

paths =: 0 : 0
The number of paths of length k connecting distinct
adjacent points in a square grid of n * n points:
k\n    2      3       4       5
1      4      9      16      25
2     12     40      84     144
3     24    160     408     768
4     24    496    1764    3768
5          1208    6712   17280
6          2240   22672   74072
7          2984   68272  296360
8          2384  183472
9           784  436984
)

I've been unable to find the law of this table. Row 1 are the squares, of course. Row 2 is four times the alternate triangular numbers (3 10 21 36). Beyond that deponent witnesseth not. The columns headed 2 and 3 are complete, but even with Hui's much more efficient functions, larger cases for columns 4 and 5 are still unattainable with my computer (and my patience).

There are two ways the task can be made more efficient: first, instead of using pathb, which necessitates going back to the beginning for each case computed for a given grid size, the previous results can be stored, allowing case k to be computed by using the stepb function on the k-1 result; and second, by taking advantage of the symmetries of the square grid. The number of paths of a given length proceeding from any corner point are the same; similarly for symmetrically located edge points and interior points. The symmetries are evident if the number of paths beginning from each grid point are displayed. For example, here are the number of paths from each point for a 4-by-4 case and a 5-by-5 case:

   4 4$#/.~{.|:4 pathb 5
322 435 435 322
435 486 486 435
435 486 486 435
322 435 435 322

   5 5$#/.~{.|:5 pathb 6
1874 2752 2998 2752 1874
2752 3524 3672 3524 2752
2998 3672 3784 3672 2998
2752 3524 3672 3524 2752
1874 2752 2998 2752 1874

In general, for a grid of side n, containing n^2 points, there are only g(n) distinct numbers of paths, where g is

   g=: 2&!@>:@>.@-:

Here are the results of g applied to grid sizes one through 8:

   (,:g) 1+i.8
1 2 3 4 5 6  7  8
1 1 3 3 6 6 10 10

So that, for the 4-by-4 case, only three values have to be computed, not 16, and for the 5-by-5 case, only six, not 25. The total number of cases for a given path length k in a grid of size n can be obtained by an inner product:

   4 4$#/.~{.|:4 pathb 4
 75 109 109  75
109 148 148 109
109 148 148 109
 75 109 109  75

   4 8 4+/ . * 75 109 148
1764

   #4 pathb 4
1764

When the actual paths are needed, the additional cases can be obtained from the abbreviated tables by using the appropriate indices to select from them, the indices being the permutations obtained from the ravels of the rotations and reversals of the table i.(n,n) . Here is how this is done:

NB. get neighbours table
   n4=: nborsb 4
NB. form monad to step previous case by bonding stepb's left argument
   f=: n4&stepb
NB. initial case for abbreviated situation
   ] ip=: ,.0 1 5
0
1
5
NB. Get all paths for abbreviated argument
   $p42=: f ip
16 2
NB. Only sixteen cases of the 84 obtained using full argument
   $4 pathb 2
84 2
NB. Use phrase 7.b.d8 to obtain desired permuted lists of all cell numbers
   ] allpcn=: ,"2 (i.8)d8"0 2 i.4 4
 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
12  8  4  0 13  9  5  1 14 10  6  2 15 11  7  3
15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
 3  7 11 15  2  6 10 14  1  5  9 13  0  4  8 12
12 13 14 15  8  9 10 11  4  5  6  7  0  1  2  3
15 11  7  3 14 10  6  2 13  9  5  1 12  8  4  0
 3  2  1  0  7  6  5  4 11 10  9  8 15 14 13 12
 0  4  8 12  1  5  9 13  2  6 10 14  3  7 11 15
NB. Get all length-2 paths (with possible duplicates)
   $q42=: ,/p42{"1/allpcn
128 2
NB. Get ordered set of all length-2 paths
   $s42=: /:~ ~. q42
84 2
NB. compare ordered set to long-winded but accurate set
   (4 pathb 2) -: s42
1

Appendix

Hui's combinations suite:

startc=: i.@-.@-
countc=: <:@[ ! <:@[ + |.@startc
indexc=: ;@:((i.-])&.>)
recurc=: (countc#startc) ,. (indexc@countc{comb&.<:)
testc=: *@[ *. <
basisc=: i.@(<: , [)
comb=: basisc`recurc @. testc

Rotation/reflection suite (from J Phrases, 7.B.d8):

m0=: ]			NB. Identity
m1=: m6@m7		NB. Three-o’clock rotation
m2=: m4@m6 		NB. Six-o’clock rotation
m3=: m4@m7 		NB. Nine-o’clock rotation
m4=: |.@] 		NB. Horizontal reflection
m5=: m2@m7 		NB. Counterdiagonal reflection
m6=: |."_1@]  		NB. Vertical reflection
m7=: |:@] 		NB. Diagonal reflection
d8=: m0`m1`m2`m3`m4`m5`m6`m7 @. [
NB. i d8 y gives mi y (all rotates and reflects)