Re: [Pan-users] error: plonked posters posts' showing up as new


From: Duncan
Subject: Re: [Pan-users] error: plonked posters posts' showing up as new
Date: Mon, 21 Mar 2022 06:31:13 -0000 (UTC)
User-agent: Pan/0.150 (Moucherotte; 7b0b3fc12)

David Chmelik posted on Sun, 20 Mar 2022 02:29:35 -0000 (UTC) as
excerpted:

> Usenet may again be a little better than it was in mid-to-late 1990s in
> terms of spam--some newsgroups have no spam--but unfortunately once
> again others still are almost all spam.
> 
> So, I have a large killfile again and will be plonking more advertisers/
> pr0n & drug & weapons dealers, trolls, proselytizer religious fanatics,
> etc..
> 
> However I noticed what often happens is I get a large number updates.  I
> go to those groups then see sometimes all posts are by plonked posters
> with spam subject lines... just for a split-second, then disappear.
> 
> Since I'm now subscribed to over 1000 newsgroups (if you add Usenet and
> Gmane) seeing all those false updates wastes considerable time.
> 
> Shouldn't the logic be fixed to omit those before they ever even get
> counted as updates, so you don't waste a lot of time still seeing dozens
> spam updates?

There are several factors to consider here, some of which are inherent in 
the news protocol (NNTP) and thus not something pan can do anything about.

First of all, there's a quick and very bandwidth efficient counts update 
mode (which I'm not actually sure pan uses at all) whereby group message 
counts can be updated quickly, with little bandwidth usage and very little 
additional information (no headers, etc).  This simply asks the server for 
the first and last message sequence numbers it currently has in whatever 
group(s) and compares them to the message sequence numbers the client 
already knows about, so it can update the count of unread messages 
accordingly.

However, the result is always the *maximum* number of potential messages 
available, not necessarily the number *actually* available.  In 
particular, some servers assign message numbers before they do their 
filtering, if any, and some messages may simply be gone from the server due 
to server-policy-specific spam filtering, copyright or COPA takedown 
orders, message cancels, no-carry policies such as dropping binaries posted 
to anything outside of the alt.binaries.* hierarchy (which can affect 
binaries groups too if the post was cross-posted to non-binaries groups), 
etc.  These will appear in the initial counts but not actually be available.
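
To make the counts arithmetic concrete, here is a minimal Python sketch.  
It assumes the standard NNTP GROUP reply of the form "211 count low high 
group"; the function names and the sample numbers are made up for 
illustration, and this is not a description of what pan itself does.

```python
def parse_group_response(line):
    """Parse an NNTP GROUP reply of the form '211 count low high group'."""
    parts = line.split()
    if len(parts) < 5 or parts[0] != "211":
        raise ValueError("unexpected GROUP response: " + line)
    count, low, high = (int(p) for p in parts[1:4])
    return {"count": count, "low": low, "high": high, "group": parts[4]}


def max_new_articles(last_seen, low, high):
    """Upper bound on unseen articles, from sequence numbers alone."""
    if high < low:
        # Servers may report an empty group with high below low.
        return 0
    start = max(last_seen + 1, low)
    return max(0, high - start + 1)


# Hypothetical reply and hypothetical client-side high-water mark.
info = parse_group_response("211 1234 3000234 3002322 misc.test")
estimate = max_new_articles(3002300, info["low"], info["high"])

# Hypothetical set of article numbers that still exist on the server
# after filtering, cancels and takedowns.
still_on_server = {3002305, 3002310}
actual = len([n for n in still_on_server if n > 3002300])
```

The estimate comes only from the number range, so it can only be an upper 
bound; the real count is known once overviews are fetched.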

Second, there's overviews mode, aka downloading "headers".  But, this does 
*NOT* download true headers.  Rather, it downloads an abridged version 
containing only the most common headers typically used for display of the 
message list.  This typically includes From, Subject, Size/Lines, 
References (necessary for threading), and Message-IDs, and server admins 
can configure it to include others if they wish, but it does *NOT* 
normally include less common/useful headers such as organization, custom 
headers, etc.

This affects scoring/watching/killfiling in that headers available in the 
overview can be scored against with just the information in the overview, 
that is, without downloading the actual message, while those not in the 
overview require actually downloading the message to apply that bit of the 
score.
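
As a rough illustration of that split, the Python sketch below sorts 
scored headers into those answerable from the overview and those that need 
the full article.  The default field set is the usual overview list (the 
ones named above plus Date); the function names are hypothetical and a 
given server may expose more fields than this.

```python
# Common default overview fields; servers may be configured to add more.
DEFAULT_OVERVIEW_FIELDS = {
    "subject", "from", "date", "message-id", "references",
    ":bytes", ":lines",
}


def can_score_from_overview(header, extra_fields=()):
    """True if a rule on this header needs only the overview data."""
    fields = DEFAULT_OVERVIEW_FIELDS | {f.lower() for f in extra_fields}
    return header.lower() in fields


def split_rules(headers, extra_fields=()):
    """Split headers into (overview-only, needs-full-article) lists."""
    cheap = [h for h in headers if can_score_from_overview(h, extra_fields)]
    costly = [h for h in headers
              if not can_score_from_overview(h, extra_fields)]
    return cheap, costly


cheap, costly = split_rules(["From", "Subject", "Organization"])
```

Rules landing in the second list are the ones that force a message 
download before the score can be applied.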

Of course it's far better to be able to score without downloading, since 
that lets a killfile avoid downloading the message entirely.  For 
nym-switching posters in particular that's not always possible, but there's 
often still something scoreable in the full headers (or body content), and 
being able to auto-ignore those posts, even if they have to be downloaded 
to do it, can still be quite useful.


So depending on what headers exactly you're scoring on, or even depending 
on how the server does its numbering and filtering, you may see quite a 
number of messages that pan can't preemptively do anything about, until it 
gets more information, either downloading "headers" (actually overviews), 
or for headers not in the overview, even downloading the entire message.


Meanwhile, particularly if your scorefile is large and not efficiently 
structured, processing it will take some time too.  Here's a short example 
from my pr0n scorefile (very dated now because, as I've posted before, I've 
not been active in the binaries for years, possibly over a decade now):

[alt.*]
Score:: =-9999 %Alt kill
        From: Seeking teens
        From: teens seeker
        From: sex coed
        From: NudeGirls

        Subject: R/-\\PE
        Subject: R/-\|PE

That's going to be **FAR** more efficient than individual score entries 
for each of those.  And note that they're headers that should be in the 
overview as well.

If you've only added entries from the pan GUI and never text-edited them 
into something more efficient like the above, your scorefile will instead 
hold one single-entry score per rule.  If you're doing over 1000 groups as 
you mentioned, you could *easily* have tens of thousands of those 
single-entry scores, which could be combined into a rather more efficient 
100-200 or so compound entries like the above.  I've never let mine get 
overgrown and really haven't done anything lately with it at all, so I 
can't do any before/after comparisons, but I'm guessing it could make the 
difference between seeing some of the killfiled posts momentarily while pan 
processes the inefficient mess, and having them all processed before it 
displays anything (especially on a fast machine with plenty of RAM and 
NVDIMM storage, something my now decade-old machine is lacking, tho I did 
do the SSD upgrade from spun-glass on the SATA3s).
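
For anyone wanting to do that merge mechanically rather than by hand, 
here is a small Python sketch.  It is not a pan tool: the function name and 
the tuple layout are invented for illustration, and it assumes each input 
rule has exactly one test, so that collecting them under a "Score::" (any 
test matches) entry keeps the same meaning.  Back up the scorefile and 
check the result before trusting it.

```python
def merge_rules(rules):
    """Merge single-test rules into compound 'Score::' entries.

    rules: iterable of (group_pattern, score, header, pattern) tuples.
    Returns scorefile-style text with one entry per (group, score) pair.
    """
    grouped = {}
    for group, score, header, pattern in rules:
        tests = grouped.setdefault(group, {}).setdefault(score, [])
        if (header, pattern) not in tests:  # drop exact duplicates
            tests.append((header, pattern))

    lines = []
    for group, by_score in grouped.items():
        lines.append("[%s]" % group)
        for score, tests in by_score.items():
            lines.append("Score:: %s" % score)
            for header, pattern in tests:
                lines.append("        %s: %s" % (header, pattern))
            lines.append("")
    return "\n".join(lines)


# Three single-entry rules, one of them a duplicate.
rules = [
    ("alt.*", "=-9999", "From", "Seeking teens"),
    ("alt.*", "=-9999", "From", "NudeGirls"),
    ("alt.*", "=-9999", "From", "Seeking teens"),
]
merged = merge_rules(rules)
print(merged)
```

Rules with different scores stay in separate entries under the same group 
header, so only the grouping changes and not the scores themselves.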

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



