Re: [Bug-gnubg] Re: Strange FIBS ratings
From: Christopher D. Yep
Subject: Re: [Bug-gnubg] Re: Strange FIBS ratings
Date: Wed, 10 Sep 2003 07:03:45 -0400
At 03:34 PM 9/9/2003 -0400, Douglas Zare wrote:
> Quoting "Christopher D. Yep" <address@hidden>:
> > I think this phenomenon has been known for many years now. Kees'
> > experiments and Douglas Zare's research on Gammonvillage are just the
> > latest examples supporting this conclusion.
> Some have known it, others have not. I have been arguing that
As a very rough guess, I'd say that 3% to 30% of all backgammon players
know this fact today (that checker errors give up more equity than cube
errors). Those who own gnubg (or Snowie) and regularly use the Player
Records (or Account Manager) should know this fact, assuming they care
enough about their stats to review them regularly. Of the 70%-97% who
don't know the fact, some players grossly overestimate the importance of
the cube. One casual player told me that "it's easy to move the checkers
around, but that cube errors account for 98%-99% of the total equity lost"!
Humans have been wondering which is more costly, checker errors or cube
errors, for a long time, even before the concept of EMG was invented.
There are two different questions:
(1) Which gives up more equity in ppg/mwc (points per game for a money
game, match winning chances for a match), checker errors or cube errors?
(2a) Do players have higher checker error rates or higher cube error rates,
with error rates measured using Snowie methodology?
(2b) Same as 2a, but using gnubg methodology?
#2b is significantly different from #2a. #2a uses the same denominator for
both checker and cube error rates, so the ratio of (checker error rate) to
(cube error rate) is the same as the ratio of (total EMG given up by
checker errors) to (total EMG given up by cube errors). If I remember
correctly, gnubg checker error rate is the total EMG given up (checker)
divided by the total number of unforced checker plays, while gnubg cube error
rate is the total EMG given up (cube) divided by the number of cube
decisions (actual or "close", based on some threshold).
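To make the difference between the two conventions concrete, here is a minimal Python sketch of the two calculations as described above. The class, the function names, and all the numbers are made up for illustration; neither program's actual code nor its exact denominators are being quoted here.

```python
from dataclasses import dataclass


@dataclass
class SessionStats:
    """Totals for one player over some analysed games (hypothetical container)."""
    checker_emg_lost: float   # total EMG given up by checker errors
    cube_emg_lost: float      # total EMG given up by cube errors
    unforced_moves: int       # number of unforced checker plays
    cube_decisions: int       # actual or "close" cube decisions
    shared_denominator: int   # one denominator used for both rates


def shared_rates(s: SessionStats) -> tuple[float, float]:
    """Checker and cube rates over the same denominator (the #2a description)."""
    return (s.checker_emg_lost / s.shared_denominator,
            s.cube_emg_lost / s.shared_denominator)


def separate_rates(s: SessionStats) -> tuple[float, float]:
    """Checker and cube rates over their own denominators (the #2b description)."""
    return (s.checker_emg_lost / s.unforced_moves,
            s.cube_emg_lost / s.cube_decisions)


# Invented totals: twice as much EMG lost on checker play as on the cube.
stats = SessionStats(checker_emg_lost=2.0, cube_emg_lost=1.0,
                     unforced_moves=200, cube_decisions=20,
                     shared_denominator=250)

shared_checker, shared_cube = shared_rates(stats)
sep_checker, sep_cube = separate_rates(stats)

# With a shared denominator the rate ratio matches the ratio of the totals.
print("shared-denominator ratio:", shared_checker / shared_cube)
# With separate denominators the ratio can look completely different.
print("separate-denominator ratio:", sep_checker / sep_cube)
```

With these invented totals the player loses more EMG on checker play, yet shows the higher error rate on the cube under the separate-denominator convention, simply because there are far fewer cube decisions to divide by.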
I don't know the entire history of this discussion (partly because it is
spread across multiple threads; I haven't read all the e-mails). #1
interests me much more, so I haven't commented yet on #2b, but I'm guessing
the thread was initially inspired by #2b.
The casual player doesn't have Snowie or GNU and is more concerned with #1.
> 1) Humans give up more equity through checker play.
> 2) Using EMG overstates the amount given up through cube play.
> Many people have not been convinced (mainly weaker players), and I hope
> that my column will convince them.
Question #1 has interested me since I started playing in the early
1990s. When I bought Snowie in 1999 I checked my own errors. I was
surprised that my checker errors gave up much more equity than my cube
errors, but I used the intuitive arguments I gave earlier (mainly that
there are many more difficult checker decisions than difficult cube
decisions each game) to convince myself of the fact. I also downloaded 9
analyzed matches (all in 1999, analysed by Snowie 3) from Oasya.com (now
bgsnowie.com). I see that these matches have been taken down (except
Ballard vs. Meyburg at the Nordic Open 1999), but they've put up 13 new
ones in their place (http://bgsnowie.com/backgammon/matches.dhtml). If you
have time, you may wish to review these. I'd guess that these matches are
more reliable than those on Johanni's list, since presumably the decision
to record/display each match was made before the actual match was played (I
could be wrong though). Johanni's list includes only self-selected
matches, which may introduce a bias. If there is a bias, I don't know in
which direction it runs, but I'd guess that Johanni's list is more
likely to exclude matches with large cube errors: after a match a player
may check a particular cube decision (but few or no difficult checker
problems) and, if he was grossly wrong on the cube decision, be too
embarrassed to send the match in to Johanni. Additionally, I think that
Johanni's methodology is to roll out cube blunders but not checker blunders
(someone correct me if I'm wrong). The last point is definitely a bias. A
countering bias is that Snowie (at least Snowie 3) does not include checker
errors in non-contact races but does include cube errors in non-contact races.
> The second point is closer to what KvdD's experiments show. My data says that
> human cube errors happen when less mwc is at stake, on average, than checker
> play errors. His says that when gnu is told to play stupidly, its cube errors
> happen when less mwc is at stake.
I thought that Kees' study centered around trying to estimate FIBS rating
based on two variables (1) gnubg checker error rate, (2) gnubg cube error
rate. This is more than just simply concluding that cube errors happen
when less mwc (or ppg) is at stake. His overall conclusion is quite
valuable in my opinion, but only if the results are trustworthy. The most
important issue that needs further study is whether using (gnubg with
noise) is sufficient to model humans. The advantage of using (gnubg with
noise) is that we can quickly develop a huge sample size. I appreciate
both bot and human work (with your investigation being the latter). Kees
is now studying human data, which is the next logical step. Hopefully work
can continue in this area.
BTW, here are two intuitive arguments that cube errors happen when less ppg
is at stake in a money game (similar results apply in matches with respect
to mwc):
Suppose that a player's average (total cube errors in ppg) is X% of his
(total checker errors in ppg).
1. If every game ended in double/pass then we can partition each game into
periods based on the cube value:
   1. Centered cube
   2. 2-cube
   3. 4-cube
   4. 8-cube
   Etc.
Each period ends when the cube is accepted (or passed in the case of the
final cube). We have assumed that the player's average (total cube errors
in ppg) is X% of his (total checker errors in ppg). It's reasonable to
assume that this ratio applies across each period above (note that the
final period ends with a double/pass). Thus a player's normalized cube
error rate (Snowie methodology, not gnubg's methodology) will also be X% of
his normalized checker error rate.
In actuality though, not every game ends in double/pass. For games that
don't end in double/pass, the final period will involve difficult checker
decisions, but not very many difficult cube decisions (if the game ended in
double/pass though, then a representative number of difficult cube
decisions could be expected in the final period). This is because the
player holding the cube at the end of the game is usually an underdog
throughout the final period, so his cube decisions are easy (not always,
though: sometimes he has to decide whether he is too good to double, and if he
decides he is too good, the cube will not be turned on that roll).
The overall effect of the above paragraph is that cube errors are more
likely to be made on smaller cubes than checker errors.
2. The above does not consider that on large cubes (cubes >= 2), the player
on roll has to (roughly) consider doubling only when he is both the
favorite and when he owns the cube, while on very small cubes (centered
cube) he has to consider doubling on *every* move when he is the
favorite. This further amplifies the effect that cube errors are more
likely to be made on smaller cubes.
Overall conclusion: the reported Snowie normalized error rate
(equivalently: EMG error rate in the case of matches) exaggerates the
effect of cube errors on total error in ppg. This agrees with your
conclusion. There are some minor modelling flaws in #1, but these
intuitive arguments were enough to convince me when I first thought
about it a few years ago.
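The conclusion can be illustrated with a small Python sketch for money play. It assumes that a normalized error is the error in points divided by the cube value at the time of the decision, so the ppg error is the normalized error times the cube value. The error list is invented purely for illustration, with the cube errors placed on smaller cubes than the checker errors, as argued above.

```python
# Each entry: (kind of error, cube value when it was made, normalized error).
# All numbers are hypothetical.
errors = [
    ("checker", 1, 0.10),
    ("checker", 2, 0.10),
    ("checker", 4, 0.10),
    ("cube",    1, 0.10),
    ("cube",    1, 0.05),
    ("cube",    2, 0.05),
]


def total_normalized(kind: str) -> float:
    """Sum of normalized errors of one kind, ignoring the cube value."""
    return sum(err for k, _, err in errors if k == kind)


def total_ppg(kind: str) -> float:
    """Sum of errors of one kind in points, i.e. scaled by the cube value."""
    return sum(cube * err for k, cube, err in errors if k == kind)


normalized_ratio = total_normalized("cube") / total_normalized("checker")
ppg_ratio = total_ppg("cube") / total_ppg("checker")

print(f"cube/checker, normalized: {normalized_ratio:.3f}")
print(f"cube/checker, in ppg:     {ppg_ratio:.3f}")
```

Because the checker errors in this toy list sit on larger cubes, the cube-to-checker ratio is larger in normalized terms than in ppg, which is the exaggeration described above.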
Thanks for the work. While it sounds like you don't want to be mentioned
in the same note as Kees, I think both your contributions are valuable and
I hope this is taken as a compliment.
Chris