[Pan-users] Re: odd newsgroup behaviour


From:	Duncan
Subject:	[Pan-users] Re: odd newsgroup behaviour
Date:	Tue, 28 Jun 2005 00:07:34 -0700
User-agent:	Pan/0.14.2.91 (As She Crawled Across the Table)
R Kimber posted <address@hidden>, excerpted
below,  on Mon, 27 Jun 2005 21:20:21 +0100:

> The problem is that when it starts up, Pan gets the headers from
> Leafnode and lists the number of unread articles in each group. Each
> group matches on new and unread articles. Clicking on a group allows me
> to see the specified number of articles listed, except in this one
> group. If I select uk.politics.misc, which might show several hundred
> unread articles, the number of unread articles immediately changes to
> maybe 75 and it displays only that number.  I am clearly missing out on
> many of the messages in this group.
> 
> I have tried unsubscribing and then re-subscribing a day or so later,
> but that hasn't made any difference.
> 
> I have:-  Pan 0.14.2.91 [] Leafnode 1.10.7

Interesting that you are seeing this with a local news server.  The behavior
wouldn't be unduly surprising with a remote news server, but it's a bit
more so with a local one.  I don't know exactly how Leafnode works.  It may
simply grab the group xref numbers off its trunk server (to continue the
leafnode analogy), reproducing them locally without change, in which case
the behavior would be coming from that trunk server, upline.

First, the reason an unsubscribe/resubscribe in PAN doesn't change the
behavior: PAN retains the group state information across that cycle, and
that state is what you were attempting to clear.  With
the group unsubscribed, if you look in PAN's data dirs (under ~/.pan/data/
by default, so the .pan dir is normally hidden, but you probably know
that already), I bet you'll still see a number of files related to that
group. You have two options.  You can remove them manually by deletion
from the filesystem, or from within pan, you can choose delete group
instead of unsubscribe.  This should clean up all state info and allow you
to start clean, which will hopefully cure the issue. Note that deleting
the group from within PAN will mean redownloading the group list to get it
back, so you can resubscribe.  Of course, with a local leafnode, that
shouldn't be difficult, but just so you know why it isn't listed in the
existing group list once deleted.
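
For the manual route, here is a minimal Python sketch that only LISTS
candidate files so you can look them over before deleting anything.  It
assumes the group's state files carry the group name somewhere in their
filename, which I haven't verified against PAN's actual on-disk layout,
and find_group_state_files is a hypothetical helper name, not anything PAN
ships.

```python
from pathlib import Path


def find_group_state_files(data_dir, group):
    """Return files under data_dir whose filename mentions the group.

    Assumption: PAN names its per-group state files after the group.
    The match is a plain substring test, so review the result by hand
    before removing anything (a group called "a.b" also matches "a.b.c").
    """
    root = Path(data_dir).expanduser()
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*") if p.is_file() and group in p.name
    )
```

Calling find_group_state_files("~/.pan/data", "uk.politics.misc") with PAN
closed would give you the list to inspect.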

Second, a bit of explanation of what is happening when that behavior
starts occurring in the first place, altho precisely why it does so in an
individual case remains unknown.

Normally, a news server will track the posts in an individual group by
assigning sequential xref numbers to posts as they come in, so the highest
number is always the newest post to have come in.  As posts expire, lower
numbers will disappear.  Thus, there will always be a range that exists at
any point, say 2374542-2374937, 396 posts inclusive (395+1, since number
2374542 is the first number still on the server, not 2374543).

Note that while this says 396 posts /may/ exist, because that's the given
range, a lesser number of posts may actually be /available/ for download. 
Some of the ones in the middle may not be there.  Think spam filtering
after the posts have been assigned numbers, for one common cause. 
Another common one is an expiration mechanism that expires larger,
probably binary posts, sooner than it does smaller text posts on the same
group.  Still another common one is remote numbering, posts numbered
centrally, then redistributed to local servers, say at an ISP's local city
location.  In this case, transfer from the central numbering location to
the local server may occur out of order, so the local server could be
expiring things in the order IT got them, which might not be in precise
numerical order.
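
To make the arithmetic concrete, here is a small Python sketch of the
difference between what the range allows and what is actually there.  The
function names and the set of "missing" numbers are made up for
illustration.

```python
def range_capacity(low, high):
    """Count of article numbers in the inclusive range low..high."""
    return high - low + 1 if high >= low else 0


def available_numbers(low, high, missing):
    """Numbers in the range that were not expired or filtered away."""
    return [n for n in range(low, high + 1) if n not in missing]


# The range from the example above holds at most 396 posts.
capacity = range_capacity(2374542, 2374937)

# Pretend two posts in the middle were removed by a spam filter.
remaining = available_numbers(2374542, 2374937, {2374600, 2374601})
```

Here capacity is 396, while remaining holds only 394 numbers, even though
the first and last numbers on the server haven't changed.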

Now, the RFC (977) defines the reply to a client GROUP group.name
group selection request to be something like this (using the same example
as above, and assuming a successful group select):

211 402 2374542 2374937 group.name

211 is the protocol status code, similar to the HTTP 404 "page does not
exist" error we are probably all familiar with from the web, only in this
case it's the NNTP code indicating a successful group select (2xx = status
OK, x1x = newsgroup selection, xx1 = the individual status message).
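
As a rough illustration of that digit scheme, here is a Python sketch that
decodes the first two digits of a status code.  The category wording is my
paraphrase of the tables in RFC 977, and describe_status is just a name
picked for the example.

```python
# What the first digit says about the outcome of the command.
FIRST_DIGIT = {
    "1": "informative message",
    "2": "command ok",
    "3": "command ok so far, send the rest of it",
    "4": "command was correct, but couldn't be performed",
    "5": "command unimplemented, incorrect, or serious error",
}

# What the second digit says about the function the reply relates to.
SECOND_DIGIT = {
    "0": "connection, setup, and miscellaneous messages",
    "1": "newsgroup selection",
    "2": "article selection",
    "3": "distribution functions",
    "4": "posting",
    "8": "nonstandard (private implementation) extensions",
    "9": "debugging output",
}


def describe_status(code):
    """Return (outcome, function category) for a 3-digit NNTP code."""
    digits = str(code)
    if len(digits) != 3 or not digits.isdigit():
        raise ValueError("expected a 3-digit status code: %r" % (code,))
    return (
        FIRST_DIGIT.get(digits[0], "unknown"),
        SECOND_DIGIT.get(digits[1], "unknown"),
    )
```

describe_status(211) gives ("command ok", "newsgroup selection"), which is
the reading described above.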

402 is an /estimated/ number of posts available for download.  Note
ESTIMATED.  It can be MORE than the number of actual posts available, as
we know it is in this example because doing the math, we can tell there
are no more than 396 posts available, but the RFC says it MUST NEVER be
LESS than the number of posts available.  Really, this number is intended
to serve as a guideline for the client, telling it how much memory it
should be prepared to allocate, if it wants to download all messages,
nothing more.  Thus, no harm in making it MORE than the number of messages
that actually exist, as long as it's reasonably so, but making it LESS
than the number that exist would mean additional memory might need to be
allocated later.

The definition of the third and fourth numbers should be apparent from the
above discussion, as the first and last posts available for download when
the reply was issued.

The last parameter is of course a text string with the group name.
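
Putting the four fields together, a minimal Python sketch of pulling the
reply apart might look like this.  GroupReply and parse_group_reply are
names invented for the example, not anything from PAN or a library.

```python
from typing import NamedTuple


class GroupReply(NamedTuple):
    estimate: int  # estimated number of posts, NOT an exact count
    first: int     # lowest article number on the server
    last: int      # highest article number on the server
    name: str      # group name echoed back by the server


def parse_group_reply(line):
    """Parse a successful '211 n f l group' reply line."""
    parts = line.split()
    if len(parts) < 5 or parts[0] != "211":
        raise ValueError("not a successful GROUP reply: %r" % line)
    estimate, first, last = (int(field) for field in parts[1:4])
    return GroupReply(estimate, first, last, parts[4])


reply = parse_group_reply("211 402 2374542 2374937 group.name")
span = reply.last - reply.first + 1  # most posts the range can hold
```

With the example line, reply.estimate is 402 while span is 396, which is
exactly the "estimate higher than the range" case.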

Here's another slightly different example using the same basic numbers to
make a point:

211 326 2374542 2374937 group.name

In this case, the server knows there are fewer posts than the given range. 
That 326 is OK, as long as there are 326 or fewer posts available, not 327,
even tho doing the math on the range still says 396 possible posts.

I'm emphasizing the second number there, not because it really affects the
discussion at hand, but because its purpose and definition are often
misunderstood.  MSOE is one client that didn't get the implementation
right, as recent events on my ISP have demonstrated.  Apparently, it would
cope with that second example just fine, but the first one, 402, it would
NOT cope so well with.  It would expect six additional posts that it
couldn't download, and would continue to display six posts as undownloaded
and unread, even when it was all caught up.  Apparently, most servers
provide an estimate lower than the mathematical range figure almost all
the time, and the situation where the estimate is HIGHER than the range
would indicate is possible never occurred to the programmers and was never
caught in testing.  However, the RFC does NOT REQUIRE the estimate be
lower than the range, and on my ISP recently, the estimate was getting
higher than the range on occasion, screwing up MSOE users who ended up
wondering where the missing posts were going.  I believe I saw a
remark from someone using Mozilla that it got it wrong too,
unfortunately.  (FWIW, the servers in question were Highwinds, Tornado,
but again, note that they were RFC compliant, the client was not.
This just noted in case others see it at some point.  FWIW, PAN appears
to work right, in this instance.)
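
To show how that sort of miscount can come about, here is a Python sketch
contrasting a client that trusts the estimate with one that trusts only
the article numbers.  This is my guess at the failure mode, NOT actual
MSOE or Mozilla code, and both function names are hypothetical.

```python
def naive_pending(estimate, downloaded_count):
    """Treat the estimate as exact: whatever we lack must be pending."""
    return max(0, estimate - downloaded_count)


def range_based_pending(last, highest_fetched):
    """Count only article numbers above the highest one already fetched."""
    return max(0, last - highest_fetched)


# First example reply: estimate 402, range holds 396, all 396 downloaded.
stuck = naive_pending(402, 396)
caught_up = range_based_pending(2374937, 2374937)
```

Here stuck comes out as 6, the phantom posts that never arrive, while
caught_up is 0.  With the second example (estimate 326, all 326
downloaded) the naive version also gives 0, which would explain why the
problem only shows when the estimate runs high.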

OK, on to the behavior at hand.  When PAN checks for groups with unread
messages, it gets the 211 reply explained above, which gives the
range of messages available and the estimated number.  It uses this info
to update its display with the new data (subtracting the number it has
already seen from that range), without actually trying to download the
overviews for each message. When you go into the group and PAN actually
grabs the overviews, it's not unusual at all for it to find there are fewer
messages there than the numbers given in the earlier 211 message, for any
of a whole host of possible reasons, a few of which were listed above.
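
A rough Python sketch of that two-stage counting, under the assumption
that the client remembers the highest article number it has seen.  This
is an illustration of the idea, not PAN's actual code, and the names are
invented.

```python
def estimated_unread(first, last, highest_seen):
    """Unread count guessed from the GROUP reply alone (no overviews)."""
    start = max(first, highest_seen + 1)
    return max(0, last - start + 1)


def actual_unread(overview_numbers, highest_seen):
    """Unread count once the overviews have really been fetched."""
    return sum(1 for n in overview_numbers if n > highest_seen)


# Before entering the group: nothing seen yet, so the whole range counts.
guess = estimated_unread(2374542, 2374937, 2374541)

# After entering: suppose the server only returns every fifth overview.
overviews = list(range(2374542, 2374938, 5))
real = actual_unread(overviews, 2374541)
```

With these made-up numbers guess is 396 but real is only 80, the same
kind of drop you describe when you click on the group.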

In your case, however, you are seeing HUGE differences in the numbers. 
As should be clear, there's not enough info presented at this point to
know why that is.  However, there are several possibilities.  (1)  It
could be an issue with your "trunk" server.  (2)  It could be an issue
with your local leafnode server.  (3)  It could be some weird one-time
hiccup that hasn't been cleared from PAN because as described above, an
unsubscribe/resubscribe cycle doesn't clear PAN's group state.  (4)  It
could be something in PAN itself (see next).

Note that there's ONE known and still outstanding bug in PAN that COULD
cause this behavior.  It's an i18n (i-18-letters-n=internationalization)
bug, therefore a bit complex as it involves PAN's interaction with the
PANGO i18n component of GTK+ as well.  Previous to Charles' long hiatus
from PAN development (he's back now), there was some work done on the bug,
as can be seen from the PAN site's version changelog history.  He thought
it was fixed, according to the log, but the problem still exists.  Whether
his fix made things worse or better I'm not sure, but anyway...

PAN's behavior when it hits this bug is to log an assert error (therefore
turning the little log icon in the lower right corner from an information
icon to an error icon) and continue, ignoring that post.  The assert error
allows the continuation without crashing, but obviously, you don't get to
see that post.  Normally, this will happen only once in a while, on
individual posts where somebody is using a non-ASCII charset.  However,
occasionally, it'll show up in an entire thread, generally because the
subject contains an unparsed character.

On most English newsgroups, you shouldn't see this too often.  Obviously,
however, if you visit a group that normally uses whatever charset is
causing PAN the trouble, many or most of the posts will fail to show up!

I don't know if this is the problem you are experiencing or not.  If so,
clicking on that now-red log icon in the corner should reveal lines and
lines of assert errors related to the i18n stuff, one for each of your
"missing" posts.  If this is the case, I'd suggest trying another reader
for that group.  It seems from your headers you are already familiar with
Sylpheed.  IIRC it (or Sylpheed Claws anyway) does news as well.  You
might try that.  Here, for text groups I'd try KNode.  For binary groups
I use klibido (due to its automatic multi-server handling).

As far as this bug goes, I'd ordinarily suggest adding additional details
to the bug as already filed.  However, as mentioned, Charles took a rather
long "vacation" from PAN for a while (his right, after all, as he's
volunteering; if someone would sponsor PAN's development for a few
months so he could work on it full time, I imagine PAN would quickly show
it).  He has now returned, but in the meantime there have been some
dramatic changes.  Others have been working on finally adding the SQLite
backend database functionality to PAN that's been on the TODO list for
years, and there are dramatic changes coming.  With all that change, I'm
not sure how many of the old bugs, including this one, are even relevant
any more.  To be sure, there'll be even more new ones to work out in the
new code, one of the reasons we don't have a new beta to try right now --
it's simply too unstable and fast changing at the moment.  Anyway, I don't
see that updating this bug at this time would be particularly useful.  If
you wish to be on the bleeding edge, there's always the CVS version you
can try, and work the bugs out of it.  <g>  A number of users are doing
just that, altho I've not tried it just yet.

So...  Try deleting PAN's state info for the group, and see what happens. 
If that doesn't cure the problem immediately, see if it's related to the
i18n bug.  If so, you are probably best using a different reader for that
group for the time being.  If that isn't the issue and deleting the
group state info doesn't cure the problem, I'd still consider trying
another reader to see if it has the problem, but it's very possibly not
PAN but something upline (leafnode, or the source server leafnode connects
to) that's the issue.

One more thing.  If you don't want to risk your read message tracking and
the like, you can try setting up a new server in PAN, and do your testing
with it instead.  If that cures the problem, you know deleting the state
info is necessary.  If not, you might want to save yourself the trouble of
having to re-track already read messages.  Note that you can try TWO
new and separate servers, one using your current leafnode installation,
the other using its upline source server directly, further troubleshooting
the problem by revealing whether taking leafnode out of the picture solves
the problem or not.

Probably more "help" than you anticipated, but hey... if it helps!  <g>

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman in
http://www.linuxdevcenter.com/pub/a/linux/2004/12/22/rms_interview.html