[Pan-users] Re: Is this normal behavior in PAN?


From: Duncan
Subject: [Pan-users] Re: Is this normal behavior in PAN?
Date: Tue, 17 Feb 2009 11:04:30 +0000 (UTC)
User-agent: Pan/0.133 (House of Butterflies)

freeslkr <address@hidden> posted
address@hidden, excerpted below, on  Sun, 15 Feb 2009 05:07:18
+0000:

> Maybe it's this _very_annoying_ bug?
> http://bugzilla.gnome.org/show_bug.cgi?id=533686

Hmm.  I've seen a couple of mentions of that behavior, but have never 
actually observed it, either on my two current servers (gmane.org and 
the Cox service, outsourced to highwinds-media servers) or previously, 
when I had a subscription to a paid server.

I'd speculate that it may be a problem due either to an optimization 
bug, or to bugs (optimization or code) in particular versions of the 
distribution-shipped libraries that pan depends on.

Given that the entire subthread doesn't show up, it would seem to be an 
issue with the way pan threads the messages, and MAY be related to a 
different bug that also occurred on Ubuntu with certain library 
versions.  In that case, pan would stall for some period while threading 
new messages after downloading new headers in a group of some size, but 
only when run on GNOME, NOT when run on XFCE or on KDE.  IIRC it was a 
single-thread, high-CPU-utilization stall, as if pan were in a loop that 
repeatedly hit an assertion and bailed, then hit it again processing the 
next header.

As I said, the folks reporting it were all on Ubuntu, 8.04 at the time 
(I have no idea whether it was fixed for 8.10 or not), and someone 
discovered that it ONLY happened when running GNOME, NOT when running KDE 
or XFCE on the same installation.  That DEFINITELY points to some sort of 
shared library issue, but it was never pinned down.

Anyway, given that the OP here is on Ubuntu 8.04 and the bug poster said 
Debian, in 8.05, AND that they are both threading bugs, I find the 
coincidence at minimum "interesting".

So, Rick, are you running GNOME or XFCE or KDE (or something else), and 
does the behavior change if you switch desktop environments or not?

Also, to everybody running Ubuntu 8.10 or the 9.04 pre-releases (alphas/
betas/whatever-they-are-at-this-point), does the problem still occur?  If 
not, the set of possible culprit libraries is definitely limited and 
it should be relatively easy to pin down to a specific library and 
version or set of versions.  Once that has been done, it may be possible 
to upgrade just that single package -- or if not possible due to ABI 
incompatibility, possible for someone to compile the upgrade and perhaps 
make it available to anyone needing it who trusts them not to malware it, 
at least.

The speculation would be that there's a namespace collision, with two 
different libraries (perhaps different versions of the same one) 
providing incompatible functions with the same name.  When GNOME loads, 
it loads one of them.  Then pan loads, and uses that library's functions 
(incompatible because pan was built against the other library's headers) 
instead of loading the compatible functions of the same name from the 
other library.  But it doesn't abort at startup, because all the function 
names it needs are there.

Then when it comes to actually calling the bad functions, it would 
normally segfault, but being a well behaved C++ app, pan has assertions 
set to catch such unexpected problems (instead of causing a security 
issue or messy segfault as would be likely without them) and they 
trigger, throwing pan into an error recovery routine, which copes by 
dropping its attempt to thread that message.  Pan then goes on to the 
next message.

In a group with enough headers, this might trigger often enough to cause 
pan to lock up at 100% CPU for the 20 minutes or so that people were 
reporting for the other bug.  If there are just a few, the extra 
processing time wouldn't be noticed, but the threads still wouldn't show.

If a redisplay is triggered, a different code-path is used and the 
problem functions never called, so the messages suddenly show up.  It's 
worth noting that for efficiency reasons pan only threads headers once, 
when they first come in.  That they appear on redisplay therefore 
indicates it's not the threading itself that gets botched, but rather the 
display of said threading as it occurs.  Again, the redisplay apparently 
uses a different code-path which doesn't call the problem functions, so 
it works fine.

If a desktop environment other than GNOME is used, the incompatible 
version of the library won't normally be in memory (possibly with some 
exceptions if other apps are using it), and pan (or more accurately the 
glibc loader lib, ld.so.*) will find and load the compatible version of 
the library.  Since it'd be unlikely (impossible?? maybe possible with 
fast user switching??) for pan to stay loaded while GNOME is then loaded, 
what effect the pan-compatible version of the library would have on 
GNOME's behavior, should it be loaded first, remains unknown.
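
One way to check which copy of a library a running pan has actually 
picked up, rather than guessing, is to read the process's memory map.  
The sketch below assumes Linux, where /proc/<pid>/maps lists the files 
mapped into a process.  The function name is made up for illustration and 
is not part of pan or any library.

```python
def loaded_libraries(maps_text):
    """Return the sorted shared-object paths named in /proc/<pid>/maps text."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, then an optional path.
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            # Keep only entries whose file name looks like a shared object.
            if ".so" in path.rsplit("/", 1)[-1]:
                paths.add(path)
    return sorted(paths)
```

With pan running, something like 
loaded_libraries(open("/proc/%d/maps" % pid).read()), where pid is pan's 
process ID, gives a list that can be compared between a GNOME session and 
an XFCE or KDE one.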

Assuming that it IS a variation on the same bug (and I'll be pretty close 
to convinced if it turns out that switching desktop environments changes 
the behavior of this bug too), it's likely that a diff of the output of 
"ldd /path/to/pan" (ldd takes the path to the binary; "which pan" will 
print it) run from a terminal window in GNOME, against the output of the 
same command run from a terminal window in XFCE or KDE, will point to the 
culprit library.  If it doesn't, it's because there's a piece of the 
puzzle I'm not aware of yet, either in the way library loading and ldd 
work (I'm most definitely NOT an oracle on the subject), or in some aspect 
of the bug that's more complicated than I'm speculating and likely 
beyond my ability to understand at this time.  But it /should/ work, 
given what I know and understand at this moment.
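
For anyone who'd rather not eyeball the two ldd listings, here is a 
minimal sketch of the comparison in Python.  It assumes the usual 
"name => path (address)" layout of ldd output lines, and that the output 
was saved to a file in each desktop environment first.  The function 
names are made up for illustration.

```python
import re

# Matches lines such as "libfoo.so.1 => /usr/lib/libfoo.so.1 (0x...)".
# Lines without "=>" (the vdso and the loader itself) are skipped.
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(.*?)(?:\s*\(0x[0-9a-fA-F]+\))?\s*$")

def parse_ldd(output):
    """Map each library name to the path ldd resolved it to."""
    libs = {}
    for line in output.splitlines():
        match = _LDD_LINE.match(line)
        if match:
            libs[match.group(1)] = match.group(2)
    return libs

def diff_ldd(gnome_output, other_output):
    """Return {name: (gnome_path, other_path)} for libraries that differ."""
    gnome = parse_ldd(gnome_output)
    other = parse_ldd(other_output)
    changed = {}
    for name in sorted(set(gnome) | set(other)):
        if gnome.get(name) != other.get(name):
            changed[name] = (gnome.get(name), other.get(name))
    return changed
```

Load addresses are deliberately dropped before comparing, since they can 
differ from run to run and would otherwise drown out the one line that 
matters.  A library present in only one listing shows up with None on 
the other side.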

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman




