WilfÆs Workshop

Wilf Hey continues his workshop State Maps.

Calculating Best Strategies

Here we continue looking at the game of TIP, and how we can use the State Map to determine the strategy for winning this game. In the magazine this month we worked out the best TIP to make from each of the six states, depending on the Raw Product Total, when that Total was six or less. In many of the cases, the best strategy was to tip the die so that its top face was now the same as the Total. So if the Total were six, and the die had two on its top, the best move was to tip it to six; this reduces the Total to zero, and the game is won.

There are a few complications, though. When the Total is one, we cannot tip the die to one if the state is either one or six; similarly, there is no direct solution when the Total is two and the state is either two or five. The same goes for Total 3 and states 3 and 4; Total 4 and states 3 and 4; Total 5 and states 2 and 5; Total 6 and states 1 and 6.

However we discovered a secondary strategy: in these unhelpful circumstances, tip the die into a state (and reduced Total) that makes a strategy impossible to formulate for the other player. We cannot do anything if the Total is 1 and the state is either 1 or 6, but if the Total is 2 and the state is 2 or 5, we can still force a win by tipping the die into state 1. That would leave the next player with Total 1 and state 1, and we know he cannot do anything with that. Similarly when the Total is 6 and the state is 1 or 6: with states 2 through 5, a quick tip to show 6 face up wins immediately. But with state 1 or 6, we cannot tip to 6; but we can tip to 3 - which reduces the Total to 3 (6 - 3 = 3), and we know that Total 3, State 3 is a losing position.

When calculating with higher Totals, we can never tip to get an instant win. Instead we must tip to put the opponent in a losing position. For example, if the current Total is 7, a tip of 2 will put the game into Total 5, State 2 - which we have already determined is a loss. But a tip of 3 will put the game into Total 4, State 3 - also a losing situation. A tip of 4 will put the game into Total 3, State 4 as well - another losing position. A tip of 6 reduces the game to Total 1, state 6, racking up another loss for the opponent. So from Total 7, a tip of 6 is the fastest way to victory - but you cannot tip to 6 from state 1 or 6. From these two states, the quickest way to win from Total 7 is to tip to 4 - but even a tip to 3 or 2 will eventually win.

Say we want to kill the suspense: when we play to win, we will make the strategy that leads to the fastest sure win. On that basis, the chart of calculated best moves will have, for the line representing Total 7, the line 4,6,6,6,6,4. This means, when we look up the chart, that if the Total is 7, 4 is the best tip when the state is 1 or 6, and 6 is the best tip when the state is otherwise. None of these leads to an "L" (a zero in the chart), so we know that if we continue with the strategy, we will definitely win.

If we put this analysis into algorithm form, we find that the way to calculate best strategy for each of the six states associated with a particular Total is to investigate the values already found for best strategy in a diagonal going backward. Where T = Total and the results are in an array called BestStrategy(Total, State), we want to look at the six values in BestStrategy(Total - x, x) where x is 1 through 6. The new value for BestStrategy(Total, any state) is the highest x for which the value found is 0 (a losing position for the opponent). For two of the states, that highest value of x is inaccessible, so you need to find the highest accessible x where the value is 0 as well (if it exists).




To calculate the best strategy for tipping the die at each particular Total, we build up the results by looking at the diagonal representing the Total and State into which the game will be placed. When the Total is 16, a tip to 1 will place the game into Total 15, State 1, where the best strategy for the opponent would be to tip to 3. Similarly a tip to 2 will place the game into Total 14, State 2, where the opponent cannot avoid a loss (if we continue our strategy).




In the chart the best strategy for Total 16 is normally 4, because Total 12, State 4 is the position for highest x (=4) where it leaves a losing position for the opponent. But you cannot tip to 4 if the State is already 3 or 4 so for those two cases we need a second result. Tipping to 3 is good, but unfortunately that is not accessible from States 3 and 4 either. The next best is a tip to 2: it also results in zero (a loss for the opponent) so our calculated line becomes 4, 4, 2, 2, 4, 4.

Note that if there is no value 0 on the diagonal, no tip of the die can put the opponent in a losing position: if he plays best strategy, he cannot lose. In this case, you will have to score the value of this position as a loss yourself (=0). When you go ahead and calculate the best strategy when Total is 17, you will find there is only one winning strategy: tip to 4 (Total 13, State 4 is an "L"). But you cannot tip to 4 from states 3 and 4. So, reluctantly, you will find that the calculated line for best strategy when Total = 17 is 4, 4, 0, 0, 4, 4.




This is a simple version of the State Map for tipping a die. Note that when '1' is on the top of the die, you can tip it to '2', '3', '4', or '5' but not to '6'. Similarly for each face, you can tip the die to any one of four others. Note that this map has six nodes and nine edges connecting them.




Simplifying the State Map

Let's take a closer look at the State Map. In the form we have given it, it lays flat as on a piece of paper, with no lines crossing each other. Of course, though a Tree can always be drawn that way, that is not always so of other networks. Freed from any artificial restraint, let's try a little experiment with this map. We know that nodes1 and 6 are similar to each other; they are not connected to each other, but each is connected to the same set of four other nodes. The same situation makes a pairing of nodes 2 and 5, and of nodes 3 and 4.




This State Map looks more complicated, but it is the earlier one with the nodes '1', '2' and '3' placed near the nodes '6', '5' and '4' respectively, instead of rendered flat. Note the pattern of connecting edges between the nodes.




If we use three dimensions, and move node 1 so that it is directly over node 6 (and node 2 over node 5, and node 3 over node 4) we can see this relationship easily. In fact, it becomes evident that aside from the number on the die's face, node 1 and node 6 are exactly the same: it could be said that the 'state 1,6' is a single node, connected to two other nodes, 'state 2,5' and 'state 3,4'.




This is a specialised State Map, where nodes '1' and '6' are treated like a single node, and likewise for '2' and '5', and for '3' and '4'. The connecting edges are simpler as well: from the joint State '1,6' you can tip the die to either joint State '2,5' or joint State '3,4'. This map has three nodes and three edges, but correctly summarises all the State Transitions in the original, which had six nodes and nine edges.




Now we can condense these doubled nodes, and end up with a simplified (though specialised) State Map. The new State Map substitutes six nodes and nine edges with just three nodes, and only three edges joining them. This model doesn't tell the whole story: it fails to note whether there is a 6 or a 1 on the top of the die, and whether it is subsequently tipped to 3, or tipped to 4. But it doesn't matter: the transition from state to state in each of these four cases is no different, and it is the state that determines whether the game is won or lost.

Admittedly this is a bit of a simplification - but do we really need it? Well, yes, it may be important for storing a large 'best strategy' array. Earlier I suggested an array BestStrategy(Total, State), where State is from 1 to 6 and Total is from zero on upward to whatever we want. But what if we want Total as 3000? That's 18000 numbers to calculate: the computer's power is no problem, but memory storage may be a bottleneck. If you could group the states together to form one big number you can get a reduction: then a single quantity for BestStrategy(Total) could be 466664. That's only 3000 numbers now, but they are big numbers, requiring a doubleword (four bytes). That has saved you only two bytes per line (6000 in total) over using the single numbers in character form.

But reducing the situation to three states means that the biggest possible value for a calculated line of BestStrategy is 666: that comes well within the limits for a single word number (two bytes), saving an additional 6000 bytes (in our example of 3000 entries). But how can you separate the three single digits of a number like 466? See the section Separating Digits below for an easy method.

Look again at the diagonal calculation for best strategy. We have hidden states 4,5 and 6 - so now our diagonal has to 'reflect': from line 16 we must look at values in cells (15,1), (14,2), (13,3), then (12,3), (11,2) and (10,1).




If we introduce space efficiencies into our storage of strategy calculations, we will deal with only three states, (1,6), (2,5) and (3,4). When calculating the best strategy for each line in our chart representing a different Total, we now have to modify things to inspect the results in a sort of "reflected" diagonal: tipping the die to 5 when the Total is 16 places the game to Total 11, state (2,5). This is easy enough to do when we remember that in the 'dual' state (x,y), x = 7 - y.




Now we are armed with enough power to build a program that plays TIP perfectly. First of all it calculates the best strategies for all Totals and all States up to the required Total. Now it can be set to play an opponent interactively. On its turn, if simply consults the BestStrategy array it has built, and make the required tip of the die and pass the move back to the opponent. If it finds the value zero as its best strategy at any point, it should do a tip that prolongs the game most, giving more opportunity for the opponent to make a poor tip. So if it finds a zero, it should tip to 1 if the state is (2,5) or (3,4), or tip to 2 if the state is (1,6). The only exception to this is when the Total is 0 or 1, in which case the game is lost, so the program should concede.

Interestingly, you could arrange it so that the opponent to the program is another copy of the identical program. In this case, one or the other will quickly latch on to a winning strategy, while the other will mince along tipping up only ones and twos in a doomed effort to prolong the game.

You may find it fun to add some cocky, self-important statements as banter that the program will give when it finds itself on track for a win. It could also be programmed to whine a little if is left in a losing position for several consecutive moves. And best of all, it could be made to become exuberant if the opponent allows it to latch on to a winning position after it had anticipated losing earlier. Of course if the program plays its twin, no mistakes will be made, so from the very first one will consistently whine while the other just as consistently boasts.

Looking for a Cycle

There is one other consideration to be made: not only in this game, but in many processes or combinations of processes where there is the possibility of repetition. Because each line of the best strategies calculations depends on the previous six lines of best strategies, repetition could occur. If the six lines immediately before Total 'q' are identical to the six lines immediately before Total 'r', then the Total 'q' line itself will become identical to the Total 'r' line. From then on, the twinning continues, and we end up with a cycle. Now this is good, because it means that we can project values for much higher Totals instantly.

In TIP, you will find that lines for Totals 9 through 14 are identical to lines for Totals 18 through 23. Lines 15 and 24 are of course then the same, and so are 16 and 25 - and then 17 and 26 and so on. So we have the pattern of a cycle within our table: lines 0 through 8 progress, and then lines 9 through 17 are repeated over and over again endlessly.

Once we have noticed this, we can stop generating lines in the BestStrategy array. For any particular Total, we should be able to determine which of the entries 0 through 18 corresponds to it. For example, Total 1000 is calculated this way: (1000 - 8) mod 9 + 8 = 10. We subtract the 8 for totals 0 to 8, and use 9 as the cycle length. The result of the modulo operation is 2, so when we add on the first 8 subtracted off, we find that line 1000 is the same as line 10. Thus we only need the BestStrategy array to have values for lines up to the end of the first cycle.

Of course you cannot tell in advance of calculating where the first cycle starts, or how long it is. But once you find it, you can stop analysis, construct your calculation to turn high line numbers into low numbers (within the first cycle), and you are ready to employ the strategy to win the game.

Separating Digits

In the course of this month's workshop we found that we could save on storage by combining single digits into a number: so when we came to 627 it was really to be regarded as three separate quantities: 6, 2 and 7. It is easy enough to construct the number from its component digits in the first place - but how readily can we disassemble the number into individual digits?

There are two tools we can use for the task: integer division by 10, 100, 1000 and so on, and the modulo operator. Integer division is ordinary division, with no remainder taken into account. In Basic it is represented by a 'new' arithmetic symbol: the backward stroke [\]. The forward stroke is used, obviously, for ordinary division, so it is relatively easy to remember integer division, and use it in calculations. For example, 627 \ 100 = 6. That is fine for isolating the leftmost digit of our number, but how do we get at the others? That's where modulo comes in. It is in a way the opposite of integer division: instead of ignoring the remainder, it is the remainder of a division itself. For example, 627 mod 100 = 27. In formal mathematics, the modulo operator is written a little differently, but this is how it appears in BASIC. That doesn't isolate the middle digit - or any of the inner digits with longer numbers - but it can isolate the units digit with the expression 627 mod 10 = 7.

We can get at the inner digits with combinations of integer division and/or modulo: to isolate the second digit, you can code (627 mod 100) \ 10. If you wanted the 100s digit of a number that could be in the thousands or higher, you could isolate it with the expression (Total mod 1000) \ 100. Notice that the integer divide identifies (by its size) the digit to be isolated, and that the modulo term is ten times that.

Both integer divide and modulo are designed to work with integers (word or doubleword format numbers). You should be careful when using this technique to utilise syntax that ensures that the quantities are treated as integers rather than as Real numbers.

How to contact Wilf

IÆm always pleased to receive letters and e-mail with programming queries, ideas and opinions. As a strict rule I canÆt reply directly with personal one-to-one programming advice, but your input could form the basis of a future Workshop. You can e-mail me at whey@futurenet.co.uk, Fax to 01225 732295 (+44 1225 732295 from outside the UK) or write to: WilfÆs Workshop, PC Plus, Future Publishing, 30 Monmouth Street, Bath BA1 2BW, United Kingdom