I have just recently finished writing a 3D-Tic-Tac-Toe program, which works fine, except that sometimes it makes a mistake in it's move.. This problem is attributed to a set of weights which define how the game plays. What I need to be able to do, is find out a global maximum for these weights, instead of what I am doing now, which is working out values in my head and playing them off against one another.. This works for a while, but of coarse is not a good way of doing it.
The next thing I tried was modifying the losing set of weights after each game in the direction of the winning set by some random factor. (This was not a good idea either.) Even if this eventually did find some sort of best, it would only be local to those values which I started with.. I really need to find a global maximum..
The program obviously has to work on some sort of evaluation function. The bulk of mine goes like this..
Where: Score is a score obtained for any one possible board position.
Weight is a set of values by which you can indicate order of importance.
C_Rows[n] is the Number of Computer 'n' in a Rows.
P_Rows[n] is the Number of Player 'n' in a Rows.
From this we can deduce that the higher the score, the better it is for the computer, and the lower the score, the better it is for the player..
After doing further research and reading I came up with this..
The word ply refers to the depth of search..
If an evaluation function is ideal, the score obtained by applying the function directly to a position, would be the same as the score obtained as a result of a look-ahead search from that position.
Lets call the score for a root position 'RT' and the score which was backed up to that position during the previous tree search (two-ply ago) 'BK'. If the tree search is normally n-ply, the score RT will be the result of an n-ply search, whereas the score BK, although arrived at during an n-ply search, is only the result of a search to depth n-2. So RT is a more reliable score than BK.
Now, if RT-BK is positive then the value of RT is in error, and the terms in the evaluation function which contribute positivly should be given more weight, while the terms which contribute negativly should be given less weight, and obviously the converse is true..
With this system it would be possible to make modifications to the weights with each turn rather than with each game, and thus find the best set of values much quicker..
The problems are how much to modify the contributing factors by, and which one is causing the error (I've got 8). Once again this will only be able to find a local maximum (From what I can see.)
The next thing I did was to take each weight and add the score obtianed by (RT-BK) multiplied by some constant greater than 0, but less than 1 and multiplied by the number of times that factor occured..
If anybody has any suggestions on how to further continue this, or has a totally new appproach which will work just as well, to find a Global Maximum set of Weights, could you please post help out.
I would prefer to keep to integers, as longints and real are considderably slower to work with, and if you considder that for each move, at a depth of 3,
64*64*64 calculations have to be made... (Well..)
^^^^^^^^
Obviously this gets decreased throughout the duration of the game..
Any help would be appreciated, I don't know what else to do...
See ya,
Grant
--
INTERNET: Grant.Reid@p5.f120.n7102.z5.fidonet.org
via: THE CATALYST BBS in Port Elizabeth, South Africa.
(catpe.alt.za) +27-41-34-1122 HST or +27-41-34-2859, V32bis & HST.