RE: [Bug-gnubg] User training of the Neural Nets
From: Ian Shaw
Subject: RE: [Bug-gnubg] User training of the Neural Nets
Date: Fri, 25 Aug 2006 16:08:22 +0100
Øystein Johansen wrote on 23 August 2006 19:43
> 1. Get the gnubg-nn code
> cvs -d:something:blah co gnubg-nn
So there's a separate program for developing NNs? I'd be interested in having a
look. What do I need to do?
> 3. Before you start training anything:
> Steal the neural net evaluation code from gnubg, the code
> that uses SSE, and apply it to the code in gnubg-nn.
> This step will save you a lot of time in the training.
> (commit the changes back to the cvs)
Am I right in thinking that the 5-node pruning nets do not use SSE
vectorisation? At one point, you were considering implementing this. Did
anything come of it? IIRC, you also mentioned increasing the hidden nodes to 8,
because the loops have to be in multiples of 4.
> Here's where I stranded... It worked, it worked! I could breed
> new nets, but none of the nets I trained was significantly
> better than the original one, no matter how long I trained.
>
> 6. A programmer can now try out different things, like further
> splitting of neural nets, or altering the inputs, or guessing
> other algorithms that might work.
>
> Look at the different hand crafted inputs, can anyone be
> removed? Can anything be added? I believe there is code to
> dynamically add and remove nn inputs. If you add an input,
> make sure you add a new 'concept' and not just something
> that's linearly dependent on some other inputs.
>
LINEARITY
Can someone clarify what is meant by "linearly"? Tesauro has also mentioned
this, and I would like to ensure I know exactly what is meant in the nn context.
For example, I would call the pipcount linear because it is simply the sum of
all the chequer distances from home. Indeed, pipcount is not in the gnubg input
set.
I would call home-board strength non-linear, because the number of dancing
rolls is proportional to the square of the number of points closed.
Is this correct?
NEW NEURAL NET INPUTS
I spent some of my holiday reading eval.c, trying to understand the current set
of nn inputs. (Much to my wife's discomfort: "Ian, have you brought work on
holiday?" "No." "Well it looks like work." "Yes, but it's much more fun!")
I was inspired by the idea of trying to capture some of the concepts Robertie
espoused in Modern Backgammon. This has the appeal of continuing the
man-machine feedback loop, since the theme of Robertie's book is to explain
concepts that have been learnt from the bots. If you've not read the book, the four
concepts are Efficiency, Connectivity, Non-commitment and Robustness. These
don't seem to be explicitly encoded as nn inputs (though they may be included
in the I_MOBILITY and I_ESCAPES features, which I don't fully understand yet.)
The first two are probably easier to implement. Here are some ideas -
suggestions welcome.
Connectivity could be measured by the number of friendly chequers up to six
points ahead of each chequer, or the number of rolls that join one chequer to
another. However, this seems quite linear to me, so might not help.
I was considering point-making rolls as one measure of efficiency. This tends
to be the square of the number of occupied points, so I see it as non-linear.
Similarly, one could count Point-on-Head rolls.
The distribution of spares is another obvious measure of efficiency. However,
this is already encoded in the basic board structure.
The common factor in these suggestions is that they attempt to look ahead to
the player's next roll. Perhaps they can identify some of the tactical
advantages currently only found by 2-ply analysis.
It might make sense to split the board into two halves for these measures,
since localized tactics on your side of the board are often different to
tactics on the other side of the board. For example, we try to build primes on
our half of the board, but run to safety with the back men.
For Non-commitment, perhaps one could define an input that measures the
"purity" of a position.
Robustness might be measured as the degree of freedom each chequer has to play
each die 1-6 in terms of being unblocked or not deep in the home board.
Joseph mentioned that he had tried and discarded various hand-crafted inputs.
Is there a record of what has been tried already? (I assume that the inputs in
eval.c CalculateHalfInputs are all actually used by the nn, not deemed
unhelpful and weighted to 0 elsewhere.)
BASIC BOARD ENCODING
Has anyone tried modifying the basic board encoding recently?
For each point, there are three boolean inputs and one integer input.
The boolean inputs are set true for 1, 2 and 3 chequers on the point, i.e. a
blot, a point, and a spare. The integer input counts chequers above three.
AFAIK, this encoding was first used by Tesauro in the early 90s. It worked well
so it seems that everyone has used it since. Perhaps it was the best
configuration at the time, given the computing power available, but today's PCs
are about 10 times as fast and have oodles more RAM.
It doesn't strike me that this encoding is naturally the best. Does it not
imply that the value of excess spares on a point grows linearly with their
number? I don't think this is true. For example, a fourth chequer on a
point is often good, allowing one to make points with doubles, or slot and
cover on consecutive rolls. It is usually only once we get a fifth chequer on a
point that we consider it to be "stacked". Even then, an opening 65: 24/13 is
good.
The boolean encoding of features was a stroke of genius; perhaps it should be
extended. One could try boolean inputs for fourth and fifth chequers, saving
the integer for serious stacking. If the extra 50 or 100 inputs are too much,
perhaps just the points that commonly get heavily loaded could have additional
boolean inputs: the six-, eight- and midpoints.
Again, I'm aware that the issues I'm considering have already been tackled, so
any pointers would be most welcome.
-- Ian