Morpion DNA Encoding

Posted on May 21, 2011 by Dimitri Tishchenko

My current algorithm relies on an encoding I call Morpion DNA encoding. This encoding has shown great resiliency to modification of internal structures while maintaining a grids general form.

All dna based algorithms depend on a set of operations; exhibition, modification and interpretation.

Morpion DNA:
I work with a 40×40 grid. Each position on the grid can have the possibility of being 4 different lines (E,S,NE,SE). This uniquely captures all possible lines that would be required to express any (known) grid.

The encoding relies on an array of length 6400 where each element in the array is initialized to the index value.

Array.new(40*40*4){|index| index}

The preference of any line is defined by the integer in the dna at a give index. The index is calculated as follows:

(line.x + 15) * 40 * 4 + (line.y + 15) * 4 + DIRECTION_INDEX[line.dir]

Direction index is an arbitrary (yet consistent) ordering of the directions possible at each point.
DIRECTION_INDEX = { :ne => 0, :e => 1, :se => 2, :s => 3 }

Interpretation:
In order to interpret a dna string, consider the set of possible moves available. Rank each possible move by its preference. The actual position of the move in the line does not matter. Make the highest rated move and continue until all possible moves are exhausted.

Eg.
Initially there are 28 possible moves. Find the move whose line has the highest preference, then make it. Now there are between 28 and 26 possible moves. This process is then repeated with the new possible moves.

Interesting note:
An ascending ordered dna string creates a grid of 35 moves.

Exhibition:
In order to make modifications to an existing grid a dna string needs to be generated that, when interpreted will yield the same configuration. You begin with a random dna string and then ranking the taken moves of the gird. This is accomplished by starting at the highest available index (6399 in my case) and working backwards while ranking the taken moves of the grid.

Modification:
The method of dna modification that I have found most effective is to swap two elements in the dna where one of the elements corresponds to a line of the grid that is exhibited on the grid. The number of times this is done depends on the algorithm.

Ongoing work:
I am still experimenting with different modification techniques. I am attempting to unify the Morpion move layering with the dna technique.

This entry was posted in Uncategorized. Bookmark the permalink.

15 Responses to Morpion DNA Encoding

mrdarek says:

August 8, 2011 at 10:06 am

Congratulations good results in search best morpion solitaire table. I’m beginner and I try do new world record in my program. At this moment I not have good results (best=102…). My program using nested monte carlo with rollout policy and I also using genetic algorithm like you – I using “roulette wheel” for selecting best moves… I’m interesting your “preferences” for best move – only preference i know and using is “more moves after”. Do you use other preferences? I think is not good idea calculating boards on fixed field – my program resizing minimal board if need – I think that it speed up my calculation… I think that soon I will be close world record, simpler algorithm have also advantage – less calculation – more situation are analysed. Thanks for blog about morpion solitare! If you have an amiga OS4 computer you can try my program on this operating system

Reply
Dimitri Tishchenko says:

August 9, 2011 at 4:07 am

102 is a great score. I was really happy when my program was able to beat my record by hand.

Can you describe your genetic algorithm? I tried to develop one for Morpion Solitaire but couldn’t quite figure one out.

My preference string is just a recording of moves that are taken on the board. It has no intelligence. I can’t come up with a good move selection strategy. I tried the “more moves after” strategy also but its not always the “best” move.

My initial program played out on a dynamic board. I now do all the processing on a 40×40 grid. It is more efficient to have a fixed board for the types of operations that I perform.

Glad to hear there are others working on this amazing game. Good luck!

Reply
mrdarek says:

August 9, 2011 at 6:44 am

OK I describe my algorithm but give me credits if beat world record!:
(reading some articles about roulette wheel selection method in google is recommended before read my solution…)

I have table of possible moves. Each current move has own weight=possible moves after current move. So I create roulette wheel:
1) calculate sum of weight
2) set population number – how many current moves will be used to select 1 new random move=population (in my program population=max possible moves*10)
3) calculate probability for each current move=weight of current move/sum of weight=probability
4) calculate how many repeat current move on roulette wheel member=probability*population
5) now roulette is build – time to select 1 random move from all members on roulette wheel!!! Low weighted moves are take less space on roulette. No need sorting or searching for best move. Still possible weak weighted moves but with less probability. It possible more prefering good weighted moves.. For this I do after 4: member=member*member
I never know if selected move will be good but for this take care nested monte carlo algorithm. Roulette wheel is heuristic for selecting moves in nested monte carlo algorithm described by Chris Rosin.
I think this heuristic can be easy used for any morpion solitaire algorithm, your too.

Reply
Dimitri Tishchenko says:

August 9, 2011 at 11:02 pm

Very interesting. It sounds like a very promising technique.

So your genetic algorithm is for creating the configuration of the roulette wheel selection strategy?

Reply
mrdarek says:

August 11, 2011 at 9:35 am

Not exactly-creating roulette is part of algorithm. This algorithm goal is help select better move then random move.

Reply
Zbigniew Galias says:

October 22, 2011 at 9:28 pm

Dimitri. Your method looks interesting. I would like to compare you method with the Chris Rosin method, but I do not fully understand how your method works. If you have a DNA code (for example in a move layering model) for a game and you modify it (you say that you swap two elements, which I guess means that you change at a certain position ‘0’ to ‘1’ and at another position ‘1’ to ‘0’) then the end of the resulting DNA code may become invalid. If you then want to have a complete game you have to play it randomly. If you do that I do not understand how the method differs from a random monte carlo search. Please reply here or to my email, which you can easily find in the web.

Reply
- Dimitri Tishchenko says:
  
  October 22, 2011 at 10:55 pm
  
  Modifications made to a move layer encoding disrupt the characters that appear after the modification point. Therefore modifications made to a move layer encoding is similar to a monte carlo search where a random completion is performed from a given point.
  
  Think of the move layer encoding as the most (that I have found) efficient method with which to encode a configuration. It cannot easily be used in a search algorithm.
  
  My algorithm makes a modification to the uncompressed encoding described in this post. This encoding can more easily be thought of as a preference string for all possible moves.
  
  The advantage that I think my technique has over a monte carlo search is that it has the potential to “remember” moves in the future.
  
  In terms of modifications, I have determined that swapping elements is not an effective modification technique.
  
  My current preferred modification technique is simply to turn off a used move. This throws away the current move and attempts to use the rest of the unmodified encoding. If this is a critical move to the configuration you will get a significantly lower score. If it is not a critical move then you will get a like score with a different configuration.
  
  For example imagine a random preference array is generated as an encoding. You can evaluate the array and get a set of used moves. Take one of those moves and determine its position in the preference array. Set that index to 0 so that move is not chosen and reevaluate.
  
  There are certain intricacies in the algorithm that make it more or less effective which is what I’m currently working on.
  
  Thanks for your interest and let me know if you would like any further elaborations on any points.
  
  Reply
  - Zbigniew Galias says:
    
    October 23, 2011 at 1:54 pm
    
    Thanks for your reply. It means that in fact you do not use layer encoding in the search process. Am I right?
    It is still not clear to me how your algorithm “remembers” future moves. If you remove a certain move from the DNA code than many future moves become invalid, so the game is truncated. To have a complete game you need to continue the game somehow, perhaps using some random process. Unless your method uses some clever idea how to continue the truncated game to make it similar to the initial game, but without the removed line, the method is basically a monte carlo search.
    Another question is what is the strategy of accepting new games. Is a new game accepted only if it is better than the previous one? This would perhaps lead to a very little chance of improvement, especially for longer games, where the chance that a random game is better than the previous one is very low.
mrdarek says:

October 23, 2011 at 6:05 pm

Hi !!!!
I also not very understand Dimitrii method but I’m sure – he soon beat WR! Because is only blog about morpion solitaire I want say about my new heuristic. Currently I still using Chris Rosin NRPA strategy with 2 heuristic. First is “density”. If game calculate that generated board is more dense then other – it going in this direction. What is density – every should understand this intuitively. Second – supportive heuristic is “move near previous move” – this was described in some articles so it’s no new…
I still using roulette wheel selection as best method for selecting promising moves.
Heuristic “more moves after” is wrong! Games build on this heuristic are un-dense and no have chance for good result.
My current record is 133!!!
And keep eye on my website lineapolis.dizzy.pl – AROS version will be soon!!! So if you have x86 machine every can try my engine and try beat world record!!!

Reply
- Dimitri Tishchenko says:
  
  October 23, 2011 at 6:45 pm
  
  Amazing progress! I’m very interested in what you mean by density. How do you do the calculation?
  
  Reply
Dimitri Tishchenko says:

October 23, 2011 at 6:38 pm

You are correct. Since the move layering encoding is so compressed any modification renders the rest of the string invalid.

I will do an example:
oiQZU+xqYvpUVoojdFKKGTEP/c3tx/3u6z6m/qLn6/H9/ is a move layer encoding for a 154 I am currently searching. Changing a character in in the string half way through, lets say the G, would render all the character after it useless and a random completion would be required to have a complete grid. This is the monte carlo tree search method.

However the encoding I use for the search is not move layering. I use a move ranking array which I refer to in this post as the DNA. I’ll start from the beginning and try to outline the search process step by step.

Move(Line) Ranking Array (DNA):
Imagine starting a new game on a sheet of paper. This sheet of paper has a 40×40 matrix of dots where possible moves are made. At each dot there are four possible lines that can be made ne, e, se, s. This gives us all the possible lines that can be made on any given board of this size. The size of the array is 40*40*4 = 6400.

What I do now is rank all the possible moves on the board with an ordered array of numbers from 0 to 6399. For example lets consider a sequentially ordered set of numbers in the array (interesting note: this creates a configuration of 35 moves). I now have a complete ranking for all possible moves. This means that when I am asked to make a decision between a given set of possible moves, I can find the index of the line of the move (each move is defined as a point and a line through that point) and answer the question based on the index of the line within my move ranking array.

Lets examine the initial state of the board. You are given 28 possible moves and asked to make a decision; Which move should I make? You take the 28 moves and look up their preferences in the ranking array. The highest ranked move is then made on the board. When the move is made, some moves might be created and some might be removed. This leaves with us with a new set of possible moves. The moves are then ranked and made until no possible moves exist. This is slightly wasteful because there are some moves in the corners that will never be made and there are lines that run off the bottom and right hand side of the grid. However the simplicity of the encoding warrants some inefficiencies.

Modification:
Lets examine our sequentially ordered ranking array that produces a 35 grid. We take the first move (highest ranked out of the first 28 moves) and set its ranking to 0. Now re-evaluate the board. It could be that that that move was critical to the integrity of the 35 grid. In which case we will get a lower score and ignore the modification. It could also be the case that that move was not that crucial and some other moves were now chosen to form a new configuration that is of higher or equal score. How this works is that the modification is localized to a region of the grid. If the move was not critical, then when evaluating moves from the modified ranking array, there is the possibility of maintaining current structures of the 35. This shows the “remembering unrelated structures” part of my algorithm.

I can show this algorithm grows quickly and steadily. Given enough time it can get very high scores. There is a problem though. It can get stuck at certain configurations. These configurations can be searched for days and not yield any improvement. For this, I have developed a type of taboo search that taboos searched configurations. My current research is focused on four areas; How many modifications should be made at a time to yield a new configuration?, How long to search a given configuration before adding it to the taboo list? How to choose which configuration to search next? and What is the most effective way to parellelize this algorithm to find better results faster?

Reply
- Zbigniew Galias says:
  
  October 24, 2011 at 6:24 pm
  
  Dimitri, Thank you for the detailed explanations. This was however not exactly what I was asking for. I already understood how you encode a grid, and how you modify it. I was asking what you do when there are no moves possible in your game, but there are still some moves, which can be done. Inevitably, if you remove one line from the existing game, it usually becomes truncated. Do you stop? Do you make any random moves? If so, do you use any heuristics what to do to make the new game similar to the current game, but without the line(s) which you removed.
  Since I did not know how you continue the game, I implemented your idea that the new game should be as similar as possible to the existing game by myself (although I do not use any DNA encoding, I just play a new game remembering the moves not used yet). I confirm that the idea is very good when compared to a random search, and works quite well. In a couple of runs (1000000 iterations of generating new game each) I reached a score of 150 lines. Much better than with a purely random search, where a result of 100 lines is difficult to get. I also confirm that sometimes the algorithm gets stuck on a low score solution (sometimes as low as 100 lines) for many iterations.
  BTW, obviously Darek is not right concerning the size of the table. 40x40x4 is the correct size with the grid of 40×40 points.
  
  Reply
  - Dimitri Tishchenko says:
    
    October 24, 2011 at 10:50 pm
    
    Due to the nature of the move ranking array, any modification will generate another valid ranking array representing another grid that will be as similar to the original as possible.
    
    I guess what I’m trying to say is that the move ranking array just takes care of it for me. Take a look at the exhibition section of this post. The moves that are exhibited in the move ranking array are the highest possible rank numbers. If I remove a move the highest ranked moves will be used (if possible). If those moves are no longer available then a random move is chosen. Again because I use a randomly generated move ranking array this is taken care of for me automatically. My algorithm doesn’t make any decisions on how to make moves. It makes modifications and sees the results.
    
    Many of my runs get stuck in the mid 150’s and only a few can make it into the 160’s.
    
    Let me know if some parts are still confusing. I would also love to hear of your progress and work on the problem.
mrdarek says:

October 24, 2011 at 2:10 pm

You do move with highest rank, but what determined that rank is higher – only position on board. You just creating systematic searching. Every move is analysed, then you go one level deeper and analyse next set of moves. No any heuristic!
Why you do a move from move set – you must analyse all possible moves. You really need heuristic. You can still rank your moves from 0 to final move but highest ranked move should be determined by heuristic – not position. Heuristic tell you that some moves are better then other. For your situation my density heuristic will be best – you should test first moves capable creating more dense board – it work for me.
How calculate density. At current level some points has 0 lines and some max 8 lines. Position with more lines are drawed from points is preferred. For example if you place an line that change situation to more lines (preferred 8-max!) going from points – it must be much better then lines on empty points – which create 1-2-2-2-1 dense line. In my density heuristic lines from points are scored in no-linear function. for example point with 7 lines are scored to 128 and point with 8 lines are scored to 256 or even 512. Low lines from point are (-scored) so my algorithm know that not should it draw because improvements on my algorithm is not best score but best calculated density.
And are you sure that you analyse all moves? From single points is 4 direction but on any direction is 5 position. So possible rank for your table should be 40x40x4x5.

Reply
Marcos says:

February 6, 2014 at 2:56 pm

I remember using a similar approach for another board game. Each “individual” in my GA was a permutation of all possible moves in the game; the solutions were evaluated starting with the initial position of the game and making, in each turn, the first move that was legal. It was simple to code, but somewhat slow (it iterates each turn over all possible moves).
The mutation I used was a simple swap of two moves.

Reply