
[gnugo-devel] The Module Formerly Known As Endgame


From: Eric Parker
Subject: [gnugo-devel] The Module Formerly Known As Endgame
Date: Fri, 17 Sep 2004 18:29:24 -0700 (PDT)

My idea is based on the paper:

A Solution of Go on 4x4 Board by Game Tree Search
Program, Shinichi Sei, Toshiaki Kawashima, the 4th
Game Informatics Group Meeting in IPS Japan, pp.69-76,
2000.

The paper can be viewed at:

http://homepage1.nifty.com/Ike/katsunari/publications_e.html

For non-Japanese speakers, there is the minor
inconvenience that the English translation of the
paper omits the search-tree figures, so it is
necessary to print both the English and Japanese
versions and cross-reference them. Both are short
papers, however.

Anyway, in the paper, the authors are able to find the
perfect play for Go on small boards, using game tree
search programs!

They set as their search goal simply: "Win the game
for black". Then they exhaustively search all legal
moves (by Japanese Go rules), using just some
node-ordering heuristics, the minimax method, and
alpha-beta pruning.

They are able to find the perfect play for Go on 2x2,
3x3, and 4x4 boards.
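For concreteness, here is a minimal sketch of the minimax-plus-alpha-beta loop the paper describes, run over a toy hand-built game tree rather than real Go positions (generating legal positions under Japanese rules is the hard part I'm glossing over here):

```python
# Alpha-beta minimax over an explicit game tree (illustrative
# structure only; a real solver would expand legal Go moves).
# Leaves are terminal scores from Black's point of view, so the
# paper's goal "win the game for Black" means maximizing at
# Black's turns.

def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    if not isinstance(node, list):       # leaf: terminal score
        return node
    if maximizing:                       # Black to move
        value = float("-inf")
        for child in node:               # node-ordering heuristics would sort here
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:            # beta cutoff: prune remaining siblings
                break
        return value
    else:                                # White to move
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)
            if alpha >= beta:            # alpha cutoff
                break
        return value

# Tiny example tree in which Black (+1) wins with perfect play:
tree = [[+1, +1], [-1, +1]]
print(alphabeta(tree, True))   # 1
```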

The point is that there are many ways to improve upon
the authors' approach; not the least of which is to
not rely on search exclusively to solve the entire
problem.

Think of it this way:

What happens if we replace minimax search and
alpha-beta pruning by more recent search tree pruning
techniques, e.g. recent advances in AI Planning?

We could probably get an increase in the size of the
board we were able to solve. Say that now we were able
to solve 5x5 boards.

Next, we relax the search goal. Instead of saying "Win
the game for black", we could say something like "Take
prisoners". This is entirely reasonable. Surely some
Go games must exist in which taking prisoners leads to
winning. These are the games that the planner will
look for.

Because we now have a relaxed search goal, we should
be able to solve 6x6 boards (I'm just pulling these
size increases out of thin air). Right? Because we're
no longer trying to win the game. In the new scenario,
"solving" the 6x6 game board just means constructing a
plan for taking prisoners on that board. This
relatively simpler goal can (sometimes) be solved with
less effort.
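To illustrate what relaxing the goal means, here is a sketch using a hypothetical position record (the field names are mine, purely for illustration) that shows the two goal tests side by side:

```python
from dataclasses import dataclass

# Hypothetical position summary; field names are illustrative only.
@dataclass
class Position:
    black_score: float      # Black's territory plus captures
    white_score: float
    black_captures: int     # prisoners taken by Black so far

def goal_win(pos):
    """Original goal from the paper: win the game for Black."""
    return pos.black_score > pos.white_score

def goal_take_prisoners(pos, n=1):
    """Relaxed goal: capture at least n prisoners, win or not."""
    return pos.black_captures >= n

# A position where Black has taken a prisoner but is losing overall:
pos = Position(black_score=10.0, white_score=12.5, black_captures=1)
print(goal_win(pos), goal_take_prisoners(pos))   # False True
```

The relaxed test can succeed on positions the strict test never would, which is exactly why "solving" a board gets cheaper.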

Next, we continue to distribute the search over the
course of the game by assuming that a new plan must be
constructed each time it is the program's turn to play.

What this means is that we don't need to maintain any
global data structures between plays, e.g. we don't
save any plans. We make a plan, take the first action,
then throw the rest of the plan away (this approach
nicely skirts all kinds of nasty "plan execution
monitoring" issues that some folks claim we must do
;).
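The plan-act-discard loop could be sketched like this, where plan_for() is a stand-in for whatever planner we end up using (its body here is purely a placeholder):

```python
# Sketch of the plan-act-discard loop: replan from scratch every
# turn, keep only the first action, and throw the rest of the plan
# away, so no global data structures survive between plays.

def plan_for(position):
    """Hypothetical planner: returns a list of moves, or None."""
    return position.get("cached_plan")   # placeholder logic only

def next_move(position):
    plan = plan_for(position)            # fresh plan, no saved state
    if plan is None:
        return None                      # fall back to pattern matching
    move, _rest = plan[0], plan[1:]      # _rest is discarded: no
    return move                          # plan-execution monitoring

print(next_move({"cached_plan": ["D4", "C3", "C4"]}))   # D4
```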

Because we re-plan every time we want to make the next
move, now we would be able to solve 7x7 boards. Right?
Because it is no longer necessary to search through
the opponent's possible moves! Surely some Go games
must exist in which it is possible to capture some of
the opponent's forces that are already on the board,
without the opponent making any further moves that
help you capture them. These are the games that the
planner will look for.

Suppose we get lucky and find enough of these kinds of
tricks that the planner is able to "solve", say, 9x9
game boards.

The games that the planner can play on one of these
9x9 game boards can be viewed as one big pattern,
where "matching the pattern" means "solving the
board", i.e. finding a plan for capturing prisoners on
the board.

Since we now have an analogy for what GNU Go does,
then the rest of the implementation can just copy the
way GNU Go does it:

GNU Go extends smaller patterns to the 19x19 Go board
by looking at all the possible ways that a pattern can
be placed on the board. But for GNU Go, the pattern
matching step itself is very inexpensive - a single
operation, I would guess.

But for the Planning approach, the "pattern matching"
step is to search. Search is expensive. This means
that we cannot be so generous with our "pattern
matching" as GNU Go is.

An idea is to do the search for each dragon that
appears on the 19x19 game board. Put differently, draw
a 9x9 square around each dragon, with the dragon in
the center. Note that some squares may be smaller than
9x9 (because some dragons will be on the edges of the
19x19 board).
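The window-clipping step is straightforward; a sketch, assuming board coordinates 0..18 and the dragon's center given as (cx, cy):

```python
# Sketch: draw a 9x9 window around a dragon's center on a 19x19
# board, clipped at the edges, so edge and corner dragons get
# smaller windows.

BOARD = 19
HALF = 4   # a 9x9 window is the center point +/- 4 in each direction

def window(cx, cy):
    """Clipped window around (cx, cy) as (x0, y0, x1, y1), inclusive."""
    x0, x1 = max(0, cx - HALF), min(BOARD - 1, cx + HALF)
    y0, y1 = max(0, cy - HALF), min(BOARD - 1, cy + HALF)
    return (x0, y0, x1, y1)

def size(w):
    x0, y0, x1, y1 = w
    return (x1 - x0 + 1, y1 - y0 + 1)

print(size(window(9, 9)))   # (9, 9)  center dragon: full window
print(size(window(0, 0)))   # (5, 5)  corner dragon: clipped
```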

This is entirely reasonable. The planner's goal is to
"Take prisoners" and, for example, it helps to have
some opponent forces on the board who can potentially
serve the role of the prisoners. Right?

The 9x9 window may be too small or too large for some
dragons. But for others, 9x9 may be just the right
size. These are the Go games that the planner will
look for.

Next, we execute the planner on each of these 9x9 game
boards that result. If the planner finds a plan for
taking prisoners, we return a move (the first action
of the plan) and a move reason (the rest of the plan?)
to genmove(), or whatever it is called, for
valuation. If the planner does not find a plan, GNU Go
still has the standard pattern matching to depend on.

With all of the above being said, I would like to make
a final comment. One suggestion I got from this list
(Gunnar's, I believe, but I could be wrong) is that
there is more to playing Go than just taking
prisoners.

In any case, it is my feeling that such an argument
loses sight of the fact that we are using search. Not
every detail has to be mentioned explicitly. All of
that "more" exists in the search space, and as long as
it doesn't conflict with the search goal, it will
automatically be considered, along with the knowledge
that we explicitly give to the search.

On the other hand, it is also possible to do something
like this: Search for all the ways that my opponent
can capture my forces. Here we are cheating to get
around the Plan Recognition problem. Plan Recognition
asks - here's a plan, what's the goal of the plan?

But if we are already looking for all the ways that
the opponent can capture my forces, then I already
know the goal of my opponent's plans! Right?

So once I have my opponent's plan, I simply need to
defend against it. The obvious trick to try here is to
deny my opponent the ability to execute his plan, i.e.
by making one of my opponent's moves myself. But which
one?

Well, there is standard pattern matching knowledge
about making progress on the Go board, defending
territory, etc. So the moves that GNU Go's standard
pattern matching suggests for this same point in the
game should be considered as well.

Put more succinctly, first, invert my opponent's plan,
i.e. flip the colors of all the stones in my
opponent's plan. Second, intersect the inverted plan
with the set of my friendly moves suggested by GNU
Go's standard pattern matching. Any of these moves in
the intersection set should safely provide the
defensive block.
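In code, that intersection step might look like the following sketch (the move names are illustrative, and the two input lists stand in for the planner's output and the pattern matcher's suggestions):

```python
# Sketch of the defensive-block idea: take the points played in the
# opponent's capturing plan (once we "flip the colors", those points
# become candidate moves for us) and intersect them with the moves
# GNU Go's standard pattern matching already suggests for our side.

opponent_plan = ["Q16", "R17", "Q17"]   # opponent's capturing plan
pattern_moves = {"Q17", "D4", "K10"}    # our pattern-matcher suggestions

# Inverting the plan reduces to treating its points as our own moves;
# the intersection picks the ones our own knowledge also endorses.
candidates = set(opponent_plan) & pattern_moves
print(sorted(candidates))   # ['Q17']
```

Any move in the intersection both denies the opponent a step of his plan and is independently endorsed by our own pattern knowledge.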

Finally, there was a suggestion on this list to use
Planning in the opening game, instead of in the
endgame. But the approach, as I outlined it above,
relies on having forces on the board to "think" about.
For example, the planner wouldn't be useful for
generating the very first move of the game, because
there are no dragons, and so the planner will never
get fired. This suggests that there is some point
after starting the game in which the planner should be
useful.

So that's my whole idea. Seriously folks, this idea
just needs a little bit of work to make it elegant.
The foundation is sound. In fact, I claim that if we
can get even this naive strategy implemented, then it
would knock the socks off of an exclusive pattern
matching GNU Go.

Eric




