[Pan-users] Re: Is this normal behavior in PAN?
From: Duncan
Subject: [Pan-users] Re: Is this normal behavior in PAN?
Date: Tue, 17 Feb 2009 11:04:30 +0000 (UTC)
User-agent: Pan/0.133 (House of Butterflies)
freeslkr <address@hidden> posted
address@hidden, excerpted below, on Sun, 15 Feb 2009 05:07:18
+0000:
> Maybe it's this _very_annoying_ bug?
> http://bugzilla.gnome.org/show_bug.cgi?id=533686
Hmm. I've seen a couple of mentions of that behavior, but have never
actually observed it, either on my two current servers (gmane.org and
the Cox servers outsourced to highwinds-media) or previously, when I had
a subscription to a paid server.
I'd speculate that it may be a problem either due to an optimization bug,
or due to bugs (optimization or code) in particular versions of
distribution shipped libraries that pan depends on.
Given that the entire subthread doesn't show up, it would seem to be an
issue with the way pan threads the messages, and MAY be related to a
different bug that also occurred on Ubuntu with certain library
versions. In that case, pan would stall for some period while threading
new messages after downloading new headers in a group of some size, but
only when run on GNOME, NOT when run on XFCE or KDE. IIRC it was a
single-thread, high-CPU-utilization stall, as if pan were in a loop that
repeatedly hit an assertion and bailed, then hit it again processing the
next header.
As I said, the folks reporting it were all on Ubuntu (8.04 at the time;
I have no idea whether it was fixed for 8.10), and someone discovered
that it ONLY happened when running GNOME, NOT when running KDE or XFCE
on the same installation. That DEFINITELY points to some sort of shared
library issue, but it was never pinned down.
Anyway, given that the OP here is on Ubuntu 8.04 and the bug poster said
Debian, in 8.05, AND that they are both threading bugs, I find the
coincidence at minimum "interesting".
So, Rick, are you running GNOME or XFCE or KDE (or something else), and
does the behavior change if you switch desktop environments or not?
Also, to everybody running Ubuntu 8.10 or the 9.04 pre-releases
(alphas/betas/whatever-they-are-at-this-point), does the problem still
occur? If not, the set of possible culprit libraries is definitely
limited, and it should be relatively easy to pin down a specific library
and version or set of versions. Once that has been done, it may be
possible to upgrade just that single package -- or, if that's not
possible due to ABI incompatibility, for someone to compile the upgrade
and perhaps make it available to anyone needing it who trusts them not
to malware it, at least.
The speculation would be that there's a namespace collision: two
different libraries (perhaps different versions of the same one)
provide incompatible functions with the same name. When GNOME loads,
it loads the one. Then pan loads and, instead of loading the compatible
functions of the same name from the other library, ends up using the
already-loaded ones, which are incompatible because pan was built
against the other library's headers. But it doesn't abort, because all
the function names it needs are there.
Then when it comes to actually calling the bad functions, it would
normally segfault, but being a well behaved C++ app, pan has assertions
set to catch such unexpected problems (instead of causing a security
issue or messy segfault as would be likely without them) and they
trigger, throwing pan into an error recovery routine, which copes by
dropping its attempt to thread that message. Pan then goes on to the
next message.
In a group with enough headers, this might trigger often enough to cause
pan to lock up at 100% CPU for the 20 minutes or so that people were
reporting for the other bug. If there are just a few, the extra
processing time wouldn't be noticed, but the threads wouldn't show.
If a redisplay is triggered, a different code-path is used and the
problem functions never called, so the messages suddenly show up. It's
worth noting that for efficiency reasons pan only threads headers once,
when they first come in. That they appear on redisplay therefore
indicates it's not the threading itself that gets botched, but rather the
display of said threading as it occurs. Again, the redisplay apparently
uses a different code-path which doesn't call the problem functions, so
it works fine.
If a desktop environment other than GNOME is used, the incompatible
version of the library won't normally be in memory (possibly with some
exceptions if other apps are using it), and pan (or more accurately the
glibc loader lib, ld.so.*) will find and load the compatible version of
the library. Since it'd be unlikely (impossible?? maybe possible with
fast user switching??) to have pan stay loaded while then loading GNOME,
it remains unknown how GNOME would behave if the pan-compatible version
of the library were loaded first.
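For anyone wanting to check which copy of a library a running pan
actually has mapped, rather than inferring it from the desktop in use,
Linux exposes that in /proc/<pid>/maps. The following is only a rough
sketch in Python; the function names are made up for illustration, and
the ".so" substring test is a simplifying assumption about how shared
libraries are named.

```python
def mapped_libraries(maps_text):
    """Return the sorted shared-object paths found in /proc/<pid>/maps text."""
    libs = set()
    for line in maps_text.splitlines():
        # A maps line has five fixed fields (address range, permissions,
        # offset, device, inode) followed by an optional pathname.
        fields = line.split(None, 5)
        if len(fields) == 6 and ".so" in fields[5]:
            libs.add(fields[5].strip())
    return sorted(libs)


def read_maps(pid):
    """Read the raw maps file for a process (Linux only)."""
    with open(f"/proc/{pid}/maps") as f:
        return f.read()
```

Running mapped_libraries(read_maps(pid)) against pan's process ID under
GNOME and again under XFCE or KDE, then comparing the two lists, would
also show modules loaded at run time, which a static dependency listing
does not.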
Assuming that it IS a variation on the same bug, and I'll be pretty close
to convinced if it ends up that switching desktop environments changes
the behavior of this bug too, it's likely that a diff of the output of
"ldd pan" run from a terminal window in GNOME, against the output of the
same command run from a terminal window in XFCE or KDE, will point to the
culprit library. If it doesn't, it's because there's a piece of the
puzzle I'm not aware of yet, either in the way library loading and ldd
work (I'm most definitely NOT an oracle on the subject), or some aspect
of the bug that's more complicated than I'm speculating and that's likely
beyond my ability to understand at this time. But it /should/ work,
given what I know and understand at this moment.
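The ldd comparison described above could be done with plain diff, but
since load addresses can differ from run to run, comparing only the
resolved paths may be less noisy. Here is a minimal sketch in Python,
assuming the two ldd outputs have been saved to files; the function
names and the file names in the usage note are hypothetical.

```python
import re

# Matches lines such as:
#   libfoo.so.1 => /usr/lib/libfoo.so.1 (0xb7c00000)
# The trailing load address is optional and is not captured.
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(.*?)(?:\s*\(0x[0-9a-f]+\))?\s*$")


def parse_ldd(text):
    """Map each library name to the path ldd resolved it to."""
    libs = {}
    for line in text.splitlines():
        m = LDD_LINE.match(line)
        if m:
            libs[m.group(1)] = m.group(2)
    return libs


def changed_libs(gnome_text, other_text):
    """Return {name: (gnome_path, other_path)} for libraries that differ."""
    a, b = parse_ldd(gnome_text), parse_ldd(other_text)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

To use it, one might run ldd "$(which pan)" > ldd-gnome.txt in a GNOME
terminal (ldd generally wants the path to the binary, not just its
name), do the same into ldd-xfce.txt under XFCE, and pass the contents
of both files to changed_libs(). An empty result would mean ldd
resolved every library to the same path in both environments.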
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman