gluster-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Gluster-devel] Change in glusterfs[master]: libglusterfs: use safer


From: Anand Avati
Subject: Re: [Gluster-devel] Change in glusterfs[master]: libglusterfs: use safer symbol resolution strategy with our ...
Date: Fri, 30 Aug 2013 14:17:20 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130329 Thunderbird/17.0.5

[cc'ing gluster-devel on the final consensus]

On 08/30/2013 01:19 PM, Brian Foster wrote:
On 08/30/2013 04:01 PM, Anand Avati wrote:
On 8/30/13 7:50 AM, Brian Foster wrote:
On 08/30/2013 09:51 AM, Brian Foster wrote:
On 08/29/2013 09:08 PM, Emmanuel Dreyfus wrote:
Anand Avati (Code Review) <address@hidden> wrote:

...
TBH, I'm not totally sure how much this impacts things that aren't cross
DSO conflicts, so I ran a quick test. If I define a function in an
executable and a dso and call the function from both points, the library
always invokes the local version if the library is loaded via dlopen().
In fact, if I remove the local version from the library, I hit an
undefined symbol error when I attempt to invoke the library call.

Note that the library does invoke the executable version of the function
if a compile time link dependency is made (i.e., even if the library is
still referenced via dlopen()/dlsym(), but not loaded by that method). I
suspect there is something going on here at compile/link time that
determines whether the executable exports the symbol, but that's an
initial guess based on observed behavior.


FYI, an elf dump of both executables shows that they differ in whether
my duplicate symbol is listed in the executable .dynsym (dynamic symbol
table) section or not.

A dump of my locally installed qemu-kvm executable shows a couple
"exported" block driver symbols: bdrv_aio_readv and bdrv_aio_writev.

Brian

Given that, perhaps the best thing to do is hold off on the
RTLD_DEEPBIND bits until we understand the behavior a bit more
conclusively here and evaluate whether it's really an issue in the weird
qemu case. Thoughts?

(FWIW, I think the change should at least hold with regard to the
RTLD_LOCAL bits. A translator is already intended to be a black box with
a few specially named interfaces. I don't see any reason to pollute the
global namespace with all kinds of extra symbols from different
translators).

NetBSD has different OS-specific flags, but I am not sure wether they
overlap or not:


Appears to define a similar behavior, but this apparently applies to an
explicit search of a symbol in a library as opposed to how the external
dependencies of the library are resolved.

Brian

P.S., All tests run on Linux.

The following special handle values may be used with dlsym():
(...)
           RTLD_SELF       The search for symbol is limited to the
shared
                     object issuing the call to dlsym() and those
shared
                     objects which were loaded after it that are
visible.




Using RTLD_LOCAL solves one half of the confusion, and I think there is
no disagreement that RTLD_GLOBAL must be changed to RTLD_LOCAL. This
solves the problem of qemu-block translator's symbols not getting picked
up by qemu when libgfapi for the hypervisor's calls.


Agreed.

The other half of the problem - functions in qemu getting called instead
of same named functions in qemu-block translator is the open concern.
Brian, you report above that a DSO always uses the version which is
available in the same DSO. If that is the case (which sounds good), I
don't understand why RTLD_DEEPBIND is required.


Not always... I suspect there are two conditions here. 1.) the
executable includes the symbol in its dynamic symbol table. 2.) the dso
is compiled/linked to look for a symbol through the dynamic symbol
tables (I suspect via PLT entries).

The experiment I outlined before effectively toggles #1. By linking the
library directly or not, I'm controlling whether the executable exports
the dependent symbol. #2 it appears can be controlled by something like
the -Bsymbolic linker option. For example, objdump a library with and
without that option and you can see the call sites to a particular
function either refer to library (relative?) offsets or plt entries.

Using this option, the library always uses the local version regardless
of whether the symbol is exported from the executable.

I am concerned about bugs like 994314, where a symbol defined in qemu
(uuid_is_null()) got picked up instead of the one in libglusterfs.so,
for a call made from libglusterfs.so. It may be the case that the
problem occured here because libglusterfs.so was not dlopen()ed, but
dynamically linked as -lglusterfs and build time.


It might not have mattered in that particular case. If the symbol was
exported by the executable (i.e., condition #1), it probably/possibly
could get bound to the unexpected symbol. What was the ultimate fix for
that bug?

While it appears that right now qemu does not export many bdrv_*
symbols, it still seems like it could be a problem if we used those
symbols or the nature of the executable changed in the future (i.e., #1
is not under our control). For that reason, I'd suggest we use something
like -Bsymbolic for qemu-block to address the second half of the problem
(assuming it doesn't break anything else :P). I mentioned this on
#gluster-dev btw, and Kaleb pointed out that this appears to also be
more portable than RTLD_DEEPBIND.


OK. This seems good for now. I will resubmit http://review.gluster.org/5697 to only change RTLD_LOCAL instead of RTLD_GLOBAL (in all places of dlopen()). As a new patch I will send changes to makefile of the qemu-block translator to use -Bsymbolic linker flag, as that might involve additional investigation and testing. In the mean time, can we get review +1 for #5367 and #5366? I think they can go in independently as we have no issues to be used in non-gfapi use cases.

Avati

Thoughts?

Brian

However the text description of RTLD_DEEPBIND makes you think that the
observation of Brian is *not* the default behavior.

Avati






reply via email to

[Prev in Thread] Current Thread [Next in Thread]