|
Volume Number: 14 (1998)
Issue Number: 8
Column Tag: Programmer's Challenge
August 1998 Programmers Challenge
by by Bob Boonstra, Westford, MA
Block Buster
One of the Vice Presidents at the Real Job (we'll call him Don) has a collection of unusual puzzles at his conference table. The kind that require you to move a large block through a small hole, untangle a set of intertwined metal loops, do the impossible with rope, or otherwise do something that requires mastery of the fourth spatial dimension or a greater aptitude toward right-brain thinking than I possess. Don retired this week, and, in his honor, this month's Challenge is devoted to a spatial reasoning exercise. Not a puzzle that would have been worthy of his conference table, perhaps, but something in that same spirit.
Our puzzle is a variation of the Soma Cube, reportedly conceived by a writer from Denmark named Peter Hein as he was listening to a lecture on quantum physics. (That's one class I don't remember being able to daydream through!) The object of the Soma Cube is to form a 3x3 cube using seven shapes formed from three or four of the smaller cubes:
Of course, the Soma can be assembled into other shapes besides a cube, which forms the basis for our Challenge. Your job will be to write code that will assemble a larger number of potentially larger shapes into an even larger designated shape.
The prototype for the code you should write is:
#if defined(__cplusplus) extern "C" { #endif #define kUnknown 0xFFFF typedef struct Cubie { SInt16 xCoord; /* x coordinate of cubie */ SInt16 yCoord; /* y coordinate of cubie */ SInt16 zCoord; /* z coordinate of cubie */ UInt16 value; /* ordinal value assigned to the Piece this cubie is part of */ } Cubie; typedef struct Piece { UInt32 numCubies; /* number of individual cubies in this piece */ Cubie **theCubies; /* pointer to array of cubies in this piece */ } Piece; Boolean BlockBuster( /* return TRUE if you were able to solve the puzzle */ long numPieces, /* number of pieces to assemble */ Piece thePieces[], /* the pieces to assemble */ Piece theGoal /* the structure you should assemble */ /* set (*theGoal.theCubies)[i].value to ((*thePiece->theCubies)+j)->value if piece j occupies cubie i in the reassembled puzzle */ ); #if defined(__cplusplus) extern "C" { #endif
The building block for our puzzle is the Piece, which is formed from numCubies individual cubes, provided to you in an array pointed to by theCubies. Each Piece has a unique nonzero identifier, which is associated with the value field in the Cubie structure. Cubies also have x, y, and z coordinates that define their relative orientation toward one another. There are no restrictions on the size or shape of a Piece, except that the constituent Cubies will all be connected (i.e., adjacent to another Cubie in the same Piece in x, y, or z.
Your BlockBuster routine is given an array (thePieces) of numPieces Pieces to work with. It is also provided with theGoal, an assembly of Cubies that you are to create from thePieces. For convenience, theGoal is also described using the Piece data structure. On input, the occupied coordinates are assigned a value of kUnknown. On output you should replace that value with the value of the Piece that occupies that coordinate. You may rotate and translate thePieces as you desire, but you may not break them by changing the relative orientation of the Cubies. You may use each Piece only once in assembling theGoal shape. All puzzles will be solvable, but if you feel you cannot solve a puzzle, BlockBuster should return FALSE.
As an example, the Pieces in the standard Soma Cube might be described as follows (with each 4-tuple representing the x, y, z, and value components of a cubie:
{0,0,1, 5}, {1,0,1, 5}, {0,1,1, 5}, {0,0,0, 5}, {0,0,0, 1}, {1,0,0, 1}, {0,1,0, 1}, {0,0,0, 3}, {1,0,0, 3}, {2,0,0, 3}, {0,1,0, 3}, {0,0,0, 7}, {1,1,1, 7}, {0,1,0, 7}, {0,1,1, 7}, {0,0,0, 6}, {1,0,0, 6}, {0,1,0, 6}, {0,1,1, 6}, {0,0,0, 4}, {1,0,0, 4}, {2,1,0, 4}, {1,1,0, 4}, {0,0,0, 2}, {1,0,0, 2}, {2,0,0, 2}, {1,1,0, 2},
The winner will be the solution that assembles theGoal shape in the minimum amount of time. There are no storage constraints for this Challenge, except that it must execute on my 96MB 8500/200.
This will be a native PowerPC Challenge, using the latest CodeWarrior environment. Solutions may be coded in C, C++, or Pascal.
Three Months Ago Winner
Sometimes Challenge solutions are very close to one another in performance, and other times the margin of victory is overwhelming. This month the winner falls into the latter category. Congratulations to Ernst Munter (Kanata, Ontario) for submitting an entry that soundly defeated the other four Challengers in the May Boggle contest. The objective was to score the most points by rearranging a Boggle® puzzle to form high scoring words, while at the same time minimizing the time penalty imposed for execution time. While Ernst took an order of magnitude more time to generate a solution, he found nearly four times as many unique words as his closest competitor, resulting in by far the highest score.
As you might suspect by looking at relative execution times, Ernst's strategy was a comparatively sophisticated one that he calls "simulated annealing". He begins by eliminating from the dictionary all words that cannot be formed by the letters available in the given instance of the puzzle. He then computes the frequency with which possible letter pairs occur in the dictionary, and uses this computation to exhaustively rearrange the puzzle letters and maximize what he calls "energy", the sum of the weights of the letter pairs in the rearranged puzzle. Only then does Ernst identify which dictionary words are present in the puzzle and compute a score. Optimization is done by randomly exchanging letter pairs until the expected gain for continuing exceeds the estimated penalty for continuing, after which the solution terminates.
One of the other solutions started by finding the longest (and therefore highest scoring) word in the dictionary that could be formed by the given letters and rearranging the puzzle to form that word. Another solution divided the board squares into three categories: corners, edges, and middle squares. It then rearranged the board so that "preferred" letters, determined a priori, were moved to the middle, and unfavored letters were moved to the corners.
There were 24 test cases used to evaluate this Challenge, 6 cases for each puzzle size (4, 5, 6, and 7). The table below lists, for each of the five solutions submitted, the cumulative number of words found, the number of those words that were unique, the associated points earned for the unique words, the time penalty (in points), and the net score. It also includes code size, data size, and programming language used. As usual, the number in parentheses after the entrant's name is the total number of Challenge points earned in all Challenges to date prior to this one.
Name | Lang | Total Words | Unique Words | Points | Time Penalty | Score | Code | Data |
Ernst Munter (364) | C++ | 11560 | 11560 | 39932 | 5191 | 34741 | 5840 | 8 |
JG Heithcock (10) | C | 5055 | 3037 | 3823 | 7 | 3816 | 1688 | 148 |
Tim Gogolin | C | 2368 | 3275 | 3275 | 47 | 3228 | 3456 | 412 |
Randy Boring (77) | C | 3154 | 1976 | 2828 | 4 | 2824 | 2836 | 944 |
Eric Hangstefer (9) | C | 1858 | 1858 | 2594 | 477 | 2117 | 3832 | 68 |
Top Contestants
Here are the Top Contestants for the Programmer's Challenge, including everyone who has accumulated more than 10 points during the past two years. The numbers below include points awarded over the 24 most recent contests, including points earned by this month's entrants.
- Munter, Ernst 210
- Boring, Randy 74
- Cooper, Greg 54
- Mallett, Jeff 50
- Rieken, Willeke 47
- Nicolle, Ludovic 34
- Lewis, Peter 31
- Maurer, Sebastian 30
- Gregg, Xan 24
- Murphy, ACC 24
- Hart, Alan 21
- Antoniewicz, Andy 20
- Day, Mark 20
- Heithcock, JG 20
- Hostetter, Mat 20
- Studer, Thomas 20
There are three ways to earn points: (1) scoring in the top 5 of any Challenge, (2) being the first person to find a bug in a published winning solution or, (3) being the first person to suggest a Challenge that I use. The points you can win are:
- 1st place 20 points
- 2nd place 10 points
- 3rd place 7 points
- 4th place 4 points
- 5th place 2 points
- Finding bug 2 points
- Suggesting Challenge 2 points
Here is Ernst's winning solution to the Boggle Challenge:
Boggle.Cp
Copyright © 1998, Ernst Munter
The Problem
A boggle board (Boggle®, a Parker Brothers game) shows a 2-D array of letters. The task is to find as many words as possible which can be traced on the puzzle, by going from letter to adjacent letter without using the same letter twice. Actually, it is the point count that should be maximized, where longer words count significantly higher than short words.
As a variant of the original game, an extra degree of freedom is available: the board may be rearranged to maximize the point count.
Solution Strategy
Finding all words that can be made on the given puzzle is fairly straight forward, with a scan of the dictionary. The real challenge is to find a way to rearrange the board efficiently. This seems to be a very hard problem to do exactly, that is to find the absolutely best arrangement.
I settled for a Monte Carlo approach, and try random shuffles until a good score is reached. The result is likely far from the optimum.
My strategy has a number of components.
First, I extract from the dictionary all candidate words for which the letters in the puzzle suffice, regardless of position.
Then I try to pre-arrange the puzzle into a state where letters that are adjacent in the candidate words are as much as possible adjacent also in the puzzle. To this end, I use a notion of "energy": Each possible 2-letter combination is given an energy value, equal to the sum of all candidate word values which contain this letter pair.
We can now compute a total energy for the boggle board which is maximized by a greedy algorithm as follows:
- The energy of each letter position is computed as the sum of the energies of all pairings with the adjacent letters.
- At each step, all letter pair exchanges are evaluated to see which would result in the largest increase in puzzle energy.
- If no increase is possible we are done.
- The selected pair is exchanged, and the previous steps are repeated.
The next phase is to evaluate the puzzle by finding all (candidate) words that match the board in its current state.
The score is then hopefully increased by a random search:
- One letter pair is exchanged at random,
- The puzzle is re-evaluated,
- The configuration with the higher score is kept.
- Try again, until we run out of time.
Time Management
The point score will be diminished by 1 point for each 20msec of running time. Rather than run for an arbitrary amount of time, or an fixed number of random swaps, an adaptive algorithm is used:
The elapsed time is measured after each run of random swaps, to establish an average cost for each swap, which includes the time to evaluate its effect on points.
The first run is for a fixed number of 32 swaps, to establish a baseline. Subsequent runs are limited in length by the following consideration. Using the known average time per run, I can fairly accurately predict the number of points to be lost by a run of N swaps. And I might gain some speculative number of points due to one or more lucky swaps.
As long as the expected gain exceeds the expected risk, we may continue to try random swaps. But it might be wise to limit the potential loss in each run to a small number such as 5 or 10% of the total points. Larger boards result in larger variations, and I take a larger risk, hopefully for a larger gain.
Assumptions
Memory required is for the Puzzle Structure (about 7KB) plus 12 bytes per "candidate", that is words which potentially fit the current puzzle.
No assumptions are made of persistence of the private storage between calls. Each call is handled completely independently. One could remember the puzzle size, and if the same, avoid recomputing the puzzle Neighborhood structure, but this is hardly worth the trouble.
#include "boggle.h" #include <string.h> #include <timer.h> enum { kMinLen=3, // no points for words < kMinLen kMaxDim=7, kMaxDimSquare=kMaxDim*kMaxDim, kHuge=127}; enum { kPenaltyRate=20000, // 20msec per point kFirstRun=32, kDefRiskFactor=13}; // 12 or 13 seem to work best struct Timer { // microseconds system timer int T0; void StartTimer() { UnsignedWide UWT; Microseconds(&UWT); T0=UWT.lo; } int ReadTimer(){ UnsignedWide UWT; Microseconds(&UWT); return UWT.lo-T0; } }; GetProfile // A word profile is a 32-bit bit map of letters used in word static unsigned long GetProfile(const char* word,int num) { unsigned long p=0; for (int i=0;i<num;i++) { p |= 1<<(*word++ & 31); } return p; } static unsigned long GetProfile(const char* word) { int c; unsigned long p=0; while (0 != (c=*word++)) { p |= 1<<(c & 31); } return p; } Candidate struct Candidate { // A dictionary word that could fit const char* word; unsigned long profile; unsigned short value; unsigned char fit; unsigned char tempFit; Candidate(){}; Candidate(int len,const char* wd){ word=wd; profile=GetProfile(wd); switch(len){ case 0: case 1: case 2: value=0;break; case 3: case 4: value=1;break; case 5: value=2;break; case 6: value=3;break; default:value=5+6*(len-7); } fit=tempFit=0; } }; Neighbors struct Neighbors { char* next[9]; // pointers to the 8 neighbors of }; // a puzzle field, plus a NULL enum {MIDDLE,EDGE,CORNER}; struct Neighborhood { Neighbors nb[kMaxDimSquare]; // 1 set per puzzle field void Init(int dim,char* c) { int dimSq=dim*dim; int col,row,x=0; Setup(x++,c++,1,dim,CORNER); for (col=1;col<dim-1;col++) Setup(x++,c++,1,dim,EDGE); Setup(x++,c++,-1,dim,CORNER); for (row=1;row<dim-1;row++) { Setup(x++,c++,dim,1,EDGE); for (col=1;col<dim-1;col++) Setup(x++,c++,1,dim,MIDDLE); Setup(x++,c++,dim,-1,EDGE); } Setup(x++,c++,1,-dim,CORNER); for (col=1;col<dim-1;col++) Setup(x++,c++,1,-dim,EDGE); Setup(x++,c++,-1,-dim,CORNER); } void Setup(int from,char* c,int A,int B,int type) { char** np=nb[from].next; switch (type) { case MIDDLE: *np++=c-A-B; *np++=c-B; *np++=c+A-B; case EDGE: *np++=c-A; *np++=c-A+B; case CORNER: *np++=c+A; *np++=c+B; *np++=c+A+B; *np=0; } } }; Tour struct Tour { // Sequence of puzzle locations char* tour[1+kMaxDimSquare]; void Clear(){tour[0]=0;} void Add(char* loc) { char** tp=tour; while (*tp) tp++; *tp=loc; *++tp=0; } void Replace(char* loc1,char* loc2) { char** tp=tour; while (*tp) { if (*tp==loc1) { *tp=loc2; return; } tp++; } return; } }; Stack struct Stack { // used to remember the track when char** CP; // tracing a word through the puzzle char val; void Push(char** c,char v){CP=c;val=v;} void Pop(char** & c,char & v){c=CP;v=val;} }; Puzzle struct Puzzle { int dimension; // 4 to 7 int dimSquare; // 16 to 49 int points; // computed points int riskFactor; unsigned long profile; // of letters in puzzle Stack* SP; char* P; // ref to the 'puzzle' Timer tmr; char letterCount[32]; int energy[kMaxDimSquare]; // energy of puzzle fields Stack stack[kMaxDimSquare]; Tour TOUR[32]; // list of puzzle fields Neighborhood N; int pairs[32*32]; // values of letter pairs Puzzle::Init void Init(int dim,char* puzzle) { dimension=dim; dimSquare=dim*dim; points=0; SP=stack; P=puzzle; memset(letterCount,0,32); for (int i=0;i<32;i++) TOUR[i].Clear(); for (int i=0;i<dimSquare;i++) { char c=P[i]; letterCount[c&31]++; TOUR[c&31].Add(P+i); } profile=GetProfile(P,dimSquare); N.Init(dimension,P); riskFactor=kDefRiskFactor-dim; // more risk for larger puzzles } Puzzle::Trace int Trace(Candidate* cd) { // trace 1 word const char* wd=cd->word; char** tour=TOUR[*wd & 31].tour; char* loc=*tour; char letter; // loop unrolled for the first and second letters of the word first: letter=*loc; if (letter=='Q') ++wd; //skip 'U' after 'Q' // deal with the first and second letters SP++->Push(tour,letter); *loc=0; //hide 1st letter tour=N.nb[loc-P].next; //goto 2nd tour loc=*tour; ++wd; second: letter=*loc; if (letter==*wd) { if (letter=='Q') ++wd; //skip 'U' after 'Q' if (wd[1]==0) goto success; SP++->Push(tour,letter); *loc=0; //hide 2nd letter tour=N.nb[loc-P].next; loc=*tour; ++wd; goto third; //goto 3rd tour } else { loc=*++tour; // next 2nd loc if (loc) goto second; (--SP)->Pop(tour,letter); **tour=letter; //restore first letter if (letter=='Q') --wd; --wd; loc=*++tour; if (loc) goto first; } return 0; //fail third: do { letter=*loc; if (letter==*wd) { if (letter=='Q') ++wd; //skip 'U' after 'Q' if (wd[1]==0) break; //success SP++->Push(tour,letter); *loc=0; //hide this letter tour=N.nb[loc-P].next; loc=*tour; ++wd; } else do { loc=*++tour; if (loc==0) { if (SP<=stack) { return 0; } (--SP)->Pop(tour,letter); **tour=letter; //restore last hidden letter if (letter=='Q') --wd; --wd; } else break; } while(1); } while(1); success: while (SP>stack) { // restore all letters (--SP)->Pop(tour,letter); **tour=letter; loc=*tour; } return 1; } Puzzle::IsPossible int IsPossible(const char* word) {// all letters present int c,len=0; char myDist[32]; memset(myDist,0,32); while (0 != (c=31 & *word++)) { if (letterCount[c]>myDist[c]) { myDist[c]++; len++; } else return 0; if ((c==(31 & 'Q')) && (*word)) word++; // don't need U after Q } return len; } Puzzle::Swap void Swap(const int a,const int b){ char ca=P[a],cb=P[b]; TOUR[ca&31].Replace(P+a,P+b); TOUR[cb&31].Replace(P+b,P+a); P[a]=cb; P[b]=ca; } // The next set of functions are for points maximization Puzzle::RandomShake // RandomShake just swaps two letters in the puzzle, // making sure they are different, and returns the // profile of the two letters unsigned long RandomShake(int & a,int & b) { char ca,cb; do { a=rand()%dimSquare; b=rand()%dimSquare; } while ((a==b) || ((ca=P[a])==(cb=P[b]))); TOUR[ca&31].Replace(P+a,P+b); TOUR[cb&31].Replace(P+b,P+a); P[a]=cb; P[b]=ca; unsigned long shakeProfile=(1L<<(ca&31)) | (1L<<(cb&31)); return shakeProfile; } Puzzle::Evaluate // The function Evaluate checks all candidates for fit, // and computes the total point score for the puzzle void Evaluate(Candidate* cd,int numCandidates) { int pts=0; for (int i=0;i<numCandidates;i++) { if (0 != (cd->fit=Trace(cd))) { pts+=cd->value; } cd++; } points=pts; } Puzzle::Improves // Improves() answers the question: would the proposed shakup // increase the score. Computes candidates.tempFit. // Uses shakeProfile to avoid recomputing candidates that // are unaffected by the swap. bool Improves( unsigned long shakeProfile, Candidate* cd,int numCandidates) { int pts=points; for (int i=0;i<numCandidates;i++) { if ((shakeProfile & cd->profile)) { cd->tempFit=Trace(cd); pts+=(cd->tempFit-cd->fit)*cd->value; } else { cd->tempFit=cd->fit; } cd++; } if (pts>points) { points=pts; return true; } return false; } Puzzle::UpdateFits // UpdateFits commits tempFit to fit, after a swap is accepted void UpdateFits(Candidate* cd,int numCandidates) { for (int i=0;i<numCandidates;i++) { cd->fit=cd->tempFit; cd++; } } Puzzle::Rearrange // Rearrange shakes the puzzle a specified number of times // and retains the configuration yielding the highest score void Rearrange(int howOften,Candidate* cd,int numCandidates) { for (int i=0;i<howOften;i++) { int s1,s2; unsigned long shakeProfile=RandomShake(s1,s2); if (Improves(shakeProfile,cd,numCandidates)) { UpdateFits(cd,numCandidates); } else { Swap(s1,s2); } } } Puzzle::Optimize // Optimize calls Rearrange a number of times, to strike a // compromise between points gained by random shakes, and // points lost due to runtime cost. void Optimize(Candidate* cd,int numCandidates) { tmr.StartTimer(); int run=kFirstRun; int countRuns=0; int runningRate; int oldPoints=points; int basePoints=points; Rearrange(run,cd,numCandidates); int runTime=tmr.ReadTimer(); int lost=(runTime+kPenaltyRate/2)/kPenaltyRate; do { oldPoints=points; countRuns+=run; runningRate=runTime/countRuns; int pointsToRisk=points/riskFactor-lost; int run=kPenaltyRate*pointsToRisk/runningRate; if (run<1) break; Rearrange(run,cd,numCandidates); runTime=tmr.ReadTimer(); lost=(runTime+kPenaltyRate/2)/kPenaltyRate; } while (points > oldPoints); } // The next set of functions relate to energy maximization Puzzle::PrimePairs // PrimePairs uses letter pairs in all candidate words // to construct an array of basic energy values for each. void PrimePairs(Candidate* cd,int numCandidates) { memset(pairs,0,sizeof(pairs)); for (int i=0;i<numCandidates;i++) { const char* wd=cd->word; int a=*wd & 31,b; do { b=*++wd & 31; if (b==0) break; pairs[32*a+b]=pairs[a+32*b]+=cd->value; a=b; } while(1); cd++; } } Puzzle::InitEnergy // Computes the energy of each puzzle field void InitEnergy() { for (int i=0;i<dimSquare;i++) energy[i]=GetEnergy(i); } Puzzle::GetEnergy // Get the energy for one field, using the Neighborhood // struct to trace the field's neighbors. int GetEnergy(int k) { char** tour=N.nb[k].next; char* loc; int* pairP=pairs+32*(P[k]&31); int e=0; while (0 != (loc=*tour++)) { e+=pairP[*loc&31]; } return e; } Puzzle::CommitSwap // This swap did some good, let's keep it void CommitSwap(int a,int b){ Swap(a,b); energy[a]=GetEnergy(a); energy[b]=GetEnergy(b); } Puzzle::ComputeDelta // ComputeDelta makes a tentative swap and returns // the increase or decrease in energy. int ComputeDelta(int a,int b) { if ((a==b)||(P[a]==P[b])) return 0; Swap(a,b); int ea=GetEnergy(a); int eb=GetEnergy(b); Swap(a,b); // restore previous state return ea+eb-energy[a]-energy[b]; } Puzzle::ComputeBestSwap // ComputeBestSwap tries all possible swaps and returns // the swap yielding the highest energy increase int ComputeBestSwap(int & aa,int & bb) { int bestDelta=-10000; for (int a=1;a<dimSquare;a++) { for (int b=0;b<a;b++) { if (P[a]^P[b]) { int delta=ComputeDelta(a,b); if (delta>bestDelta) { bestDelta=delta;aa=a;bb=b; } } } } return bestDelta; } Puzzle::MaximizeEnergy // MaximizeEnergy keeps swapping puzzle letters until no // more increase in puzzle energy can be obtained. void MaximizeEnergy() { int a,b,delta=ComputeBestSwap(a,b); do { if (delta>0) CommitSwap(a,b); else break; delta=ComputeBestSwap(a,b); } while (1); } }; PrivateData struct PrivateData { // just the puzzle and the candidate words Puzzle pzl; int numCandidates; Candidate candidates[1]; // open ended array void Init(int dimension,char* puzzle) { pzl.Init(dimension,puzzle); } void SelectCandidates(int dictSize,const char *dictionary[]) { numCandidates=0; for (int i=0;i<dictSize;i++) { const char* word=dictionary[i]; int len=pzl.IsPossible(word); if (len>=kMinLen) candidates[numCandidates++]=Candidate(len,word); } pzl.PrimePairs(candidates,numCandidates); } void Solve() { pzl.InitEnergy(); pzl.MaximizeEnergy(); pzl.Evaluate(candidates,numCandidates); pzl.Optimize(candidates,numCandidates); } int CopyResults(const char *wordsFound[]) { int n=0; Candidate* cd=candidates; for (int i=0;i<numCandidates;i++) { wordsFound[n]=cd->word; n+=cd->fit; cd++; } return n; } }; Boggle // the function Boggle is published in boggle.h long Boggle( long dimension, char puzzle[], long dictSize, const char *dictionary[], const char *wordsFound[], void *privStorage ) { int nFound=0; srand(10001); // just to make it repeatable if ((unsigned long)dimension <= kMaxDim) { PrivateData* PD=(PrivateData*)privStorage; // assumed to be enough PD->Init(dimension,puzzle); PD->SelectCandidates(dictSize,dictionary); PD->Solve(); nFound=PD->CopyResults(wordsFound); } return nFound; }
- SPREAD THE WORD:
- Slashdot
- Digg
- Del.icio.us
- Newsvine